We analyze how the sentiment of financial news can be used to predict stock returns and build profitable trading strategies. Combining textual analysis of financial news headlines with statistical methods, we build multi-class classification models to predict stock returns. The main contribution of this paper is twofold. First, we develop a performance evaluation metric to compare multi-class classification methods, taking into account the precision and accuracy of the models and methods. By maximizing the metric, we find optimal combinations of models and methods and select the best approach for prediction and decision-making. Second, this metric enables us to construct profitable option trading strategies, which can also be used as an assessment tool to analyze the models' predictive power. We apply our methodology to historical data for Apple stock and to financial news headlines from Reuters from January 1, 2012, to May 31, 2019. During validation (May 31, 2018, to May 31, 2019), our models consistently outperformed the market, with two-class one-stage models yielding returns between 30% and 45%, compared to the S&P 500 index's 1.73% return over the same period.
Citation: Jiawei He, Roman N. Makarov, Jake Tuero, Zilin Wang. Performance evaluation metric for statistical learning trading strategies[J]. Data Science in Finance and Economics, 2024, 4(4): 570-600. doi: 10.3934/DSFE.2024024