Research article

Large language model enabled synthetic dataset generation for human-AI teaming in mental health assessment

  • Received: 13 March 2025 Revised: 16 July 2025 Accepted: 17 July 2025 Published: 22 July 2025
  • Mental health assessment presents unique challenges in healthcare due to its inherently subjective nature and the scarcity of high-quality training data. This research explores the use of large language models (LLMs) to generate synthetic datasets for human-AI teaming algorithms, focusing on mental health assessments. We created a diverse dataset that simulates human-AI collaboration scenarios in diagnostic processes. Synthetic data are labeled through an approach that involves two human annotators and three LLMs, using majority voting for consensus-based annotations. The dataset initially achieves a similarity score of 83.1%, which improves further when human factors that influence decision-making are taken into account. Performance varies across mental health categories, highlighting the need for targeted improvements in data collection and model architecture. Our approach addresses several key challenges in the field, including the lack of real-world training data, privacy concerns, and the need for diverse training datasets. This study serves as a foundation for future work in this critical area, which could lead to more effective and ethically sound AI-assisted mental health assessment tools.
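
    The consensus step described in the abstract, majority voting over two human annotators and three LLMs, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `majority_label`, the example category names, and the choice to return `None` when no label has a strict majority are all assumptions made for illustration, since the abstract does not say how ties are resolved.

    ```python
    from collections import Counter
    from typing import Optional, Sequence


    def majority_label(votes: Sequence[str]) -> Optional[str]:
        """Return the label chosen by a strict majority of annotators.

        Returns None when there are no votes or when no single label
        receives more than half of them (an assumed tie policy).
        """
        if not votes:
            return None
        # most_common(1) yields the (label, count) pair with the highest count.
        label, count = Counter(votes).most_common(1)[0]
        return label if count > len(votes) / 2 else None


    # Hypothetical example: two human annotators followed by three LLMs.
    votes = ["depression", "depression", "anxiety", "depression", "anxiety"]
    consensus = majority_label(votes)
    ```

    With five annotators and more than two categories, a strict majority may not exist (for example, a 2-2-1 split), so a real pipeline would need an explicit policy for such items, such as adjudication by a human reviewer.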

    Citation: Sai Sanjay Potluri, Md Abdullah Al Hafiz Khan, Yong Pei. Large language model enabled synthetic dataset generation for human-AI teaming in mental health assessment[J]. Applied Computing and Intelligence, 2025, 5(2): 127-153. doi: 10.3934/aci.2025009



  • © 2025 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)
