S&P 500 Stock Price Prediction Using Technical, Fundamental and Text Data
Abstract
We summarized both common and novel predictive models used for stock price prediction and combined them with technical indices, fundamental characteristics and text-based sentiment data to predict S&P stock prices. A 66.18% accuracy in S&P 500 index directional prediction and 62.09% accuracy in individual stock directional prediction was achieved by combining different machine learning models such as Random Forest and LSTM together into state-of-the-art ensemble models. The data we use contains weekly historical prices, finance reports, and text information from news items associated with 518 different common stocks issued by current and former S&P 500 large-cap companies, from January 1, 2000 to December 31, 2019. Our study's innovation includes utilizing deep language models to categorize and infer financial news item sentiment; fusing different models containing different combinations of variables and stocks to jointly make predictions; and overcoming the insufficient data problem for machine learning models in time series by using data across different stocks.
- Publication:
-
arXiv e-prints
- Pub Date:
- August 2021
- DOI:
- arXiv:
- arXiv:2108.10826
- Bibcode:
- 2021arXiv210810826Z
- Keywords:
-
- Statistics - Machine Learning;
- Computer Science - Machine Learning
- E-Print:
- 19 pages, 10 figures