May 11, 2024  
2022-2023 Undergraduate Catalog [ARCHIVED CATALOG]

FIN 4230 Text Mining for Finance


This course applies machine learning methods to analyze web-based textual information and uses the parsed information to solve financial problems. Text mining is an important subfield of Natural Language Processing (NLP), and its application in financial markets, in both industry and academia, is growing rapidly. The course covers two parts: (1) Basic Textual Analysis: text preprocessing, tokenization, data wrangling, sentiment analysis, document-term matrices, text classification, and topic modeling; and (2) Retrieving Web-Based Information: accessing web APIs, parsing HTML, and static and dynamic web scraping. The course uses R/RStudio as its main programming tool; Python, another popular choice, may be introduced if time permits. Although the course emphasizes coding and data analytics, it also discusses some basic machine learning theory for textual analysis, such as Bayesian learning.

  Prerequisite(s): ECON 2110  
Credits: 3.0