In this paper, we examine the usefulness of machine learning methods such as support vector machines, random forests and bagging for extracting information from the limit order book that can be used for intraday trading. For our empirical analysis, we first extract 50 raw features from the LOBSTER message file and order book file of the iShares Core S&P 500 ETF for the period from 27.06.2007 to 30.04.2019 and then construct 18 higher-level features (aggregated to a 5-minute frequency), which serve as predictors. Using straightforward specifications for the machine learning procedures, and thereby avoiding excessive data snooping, we find that these procedures are unable to find high-dimensional patterns in the order book that could be used for trading purposes. The observed significant predictability is mainly due to the inclusion of a single variable, namely the last price change, and is probably too small to ensure profitability once transaction costs are taken into account.
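To make the 5-minute aggregation and the role of the last price change more concrete, the following is a minimal sketch under stated assumptions, not the procedure used in the paper. It assumes event timestamps in seconds after midnight together with best bid and best ask prices, keeps the last mid-price in each 5-minute bar, and computes the lag-1 autocorrelation of the resulting price changes as a simple stand-in diagnostic for how much the last change says about the next one. The function names and the toy data are hypothetical, and the autocorrelation replaces the support vector machines, random forests and bagging studied in the paper.

```python
BAR_SECONDS = 300  # 5-minute bars (assumed bar length, matching the abstract)


def five_minute_mid_prices(timestamps, best_bids, best_asks, bar_seconds=BAR_SECONDS):
    """Return the last mid-price seen in each bar, ordered by bar index.

    Assumes the events are sorted by time. Bars without any event are skipped.
    """
    last_mid = {}
    for t, bid, ask in zip(timestamps, best_bids, best_asks):
        bar = int(t // bar_seconds)
        last_mid[bar] = (bid + ask) / 2.0  # later events overwrite earlier ones
    return [last_mid[bar] for bar in sorted(last_mid)]


def price_changes(mids):
    """First differences of a sequence of mid-prices."""
    return [later - earlier for earlier, later in zip(mids, mids[1:])]


def lag1_autocorrelation(changes):
    """Sample lag-1 autocorrelation of a sequence of price changes."""
    n = len(changes)
    if n < 3:
        raise ValueError("need at least three price changes")
    mean = sum(changes) / n
    denominator = sum((c - mean) ** 2 for c in changes)
    if denominator == 0:
        return 0.0
    numerator = sum(
        (changes[i] - mean) * (changes[i - 1] - mean) for i in range(1, n)
    )
    return numerator / denominator


# Hypothetical toy data: seconds after midnight, best bid and best ask in dollars.
timestamps = [34200.0, 34350.0, 34500.0, 34700.0, 34800.0, 35100.0, 35400.0]
best_bids = [100.00, 100.02, 100.01, 99.99, 100.03, 100.00, 100.04]
best_asks = [100.02, 100.04, 100.03, 100.01, 100.05, 100.02, 100.06]

mids = five_minute_mid_prices(timestamps, best_bids, best_asks)
changes = price_changes(mids)
acf = lag1_autocorrelation(changes)
print(mids)
print(changes)
print(acf)
```

On real data, an autocorrelation close to zero would be in line with the abstract's conclusion that the predictability from the last price change is small; whether it survives transaction costs would have to be checked separately.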
Citation: Manveer Kaur Mangat, Erhard Reschenhofer, Thomas Stark, Christian Zwatz. High-Frequency Trading with Machine Learning Algorithms and Limit Order Book Data[J]. Data Science in Finance and Economics, 2022, 2(4): 437-463. doi: 10.3934/DSFE.2022022