Research article

Sentiment-enhanced rice price forecasting under sparse social-media coverage: Evidence from Saudi rice imports

  • Published: 06 May 2026
  • JEL Codes: C22, C53, Q11, Q13

  • This study examines whether carefully lagged social-media sentiment adds incremental forecast information to Saudi rice import prices once temperature-based climate variables and historical price dynamics are already included. We construct a monthly series for premium Basmati and standard Maza rice using customs-cleared price data, station-based temperature measures, and Arabic Twitter content labeled for sentiment with GPT-4. Because the retained tweet signal is sparse after relevance and engagement filtering, forecasts are evaluated in a constrained monthly setting using seasonal autoregressive integrated moving average with exogenous regressors (SARIMAX), linear regression (ordinary least squares), ridge regression, gradient boosting, and random forest under a time-ordered validation design with a held-out test period. Sentiment features improve forecast accuracy most consistently for Basmati, with smaller gains for Maza; the strongest overall results are achieved by transparent linear models and sentiment-enhanced SARIMAX rather than by the more complex tree ensembles. The findings indicate that large language model-assisted sentiment labeling can provide useful demand-side information when paired with disciplined feature engineering, while the magnitude of the benefit depends on commodity type, data coverage, and market structure.
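The evaluation idea in the abstract (lagged sentiment added to price and temperature predictors, scored under a time-ordered design with a held-out test period) can be illustrated with a minimal sketch. This is not the authors' pipeline: the data are synthetic, the one-month lag, the 12-month test window, and the ridge penalty are assumptions, and every function name is hypothetical. It only shows how a lagged exogenous feature can be compared against a baseline without leaking future information.

```python
import numpy as np

def make_lagged_design(price, sentiment, temperature, lag=1):
    """Design matrix using only information available `lag` months before the target."""
    y = price[lag:]
    X = np.column_stack([
        np.ones(len(y)),      # intercept
        price[:-lag],         # lagged price (historical dynamics)
        temperature[:-lag],   # lagged temperature (climate variable)
        sentiment[:-lag],     # lagged sentiment (candidate feature)
    ])
    return X, y

def fit_ridge(X, y, alpha=0.0):
    """Closed-form ridge fit; alpha=0 reduces to ordinary least squares.
    The intercept column (index 0) is left unpenalized."""
    penalty = alpha * np.eye(X.shape[1])
    penalty[0, 0] = 0.0
    return np.linalg.solve(X.T @ X + penalty, X.T @ y)

def expanding_window_rmse(X, y, n_test, alpha=0.0):
    """One-step-ahead RMSE over the last n_test months.
    Each forecast refits on all earlier months only (time-ordered, no shuffling)."""
    errors = []
    for t in range(len(y) - n_test, len(y)):
        beta = fit_ridge(X[:t], y[:t], alpha)
        errors.append(y[t] - X[t] @ beta)
    return float(np.sqrt(np.mean(np.square(errors))))

# Synthetic monthly data (assumption: 96 months, sentiment leads price by one month).
rng = np.random.default_rng(0)
n_months = 96
months = np.arange(n_months)
temperature = 25.0 + 10.0 * np.sin(2 * np.pi * months / 12) + rng.normal(0, 1, n_months)
sentiment = rng.normal(0, 1, n_months)
price = np.empty(n_months)
price[0] = 40.0
for t in range(1, n_months):
    price[t] = 40.0 + 0.6 * (price[t - 1] - 40.0) + 2.0 * sentiment[t - 1] + rng.normal(0, 0.1)

X_full, y = make_lagged_design(price, sentiment, temperature, lag=1)
X_no_sent = X_full[:, :3]  # baseline: intercept, lagged price, lagged temperature

rmse_without = expanding_window_rmse(X_no_sent, y, n_test=12)
rmse_with = expanding_window_rmse(X_full, y, n_test=12)
rmse_ridge = expanding_window_rmse(X_full, y, n_test=12, alpha=1.0)

print(f"OLS without sentiment: {rmse_without:.3f}")
print(f"OLS with sentiment:    {rmse_with:.3f}")
print(f"Ridge with sentiment:  {rmse_ridge:.3f}")
```

Because the synthetic price is built to depend on last month's sentiment, the sentiment-augmented model should score a lower RMSE here by construction; on real data the gain is an empirical question, which is what the study tests separately for Basmati and Maza.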

    Citation: Saad Alqithami, Musaad Alzahrani. Sentiment-enhanced rice price forecasting under sparse social-media coverage: Evidence from Saudi rice imports[J]. Data Science in Finance and Economics, 2026, 6(2): 250-276. doi: 10.3934/DSFE.2026009

  • DSFE-06-02-009-s001.pdf
  • © 2026 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)
Figures and Tables

Figures(4)  /  Tables(6)
