1. Introduction
Artificial intelligence (AI) has emerged as a game-changing technology in various fields, including healthcare. Initially, it was defined as the exhibition of human intelligence by machines, but today AI is defined as a set of technologies capable of analysing large sets of data, experiential learning, and the development of models [13]. The application of AI in nursing education has the potential to transform existing teaching approaches and improve future nurses' learning experiences. Nursing education is crucial for preparing students to provide safe, competent, and compassionate care in a changing healthcare landscape. Conventional methods, including theory and practice, are the core components of nursing education. However, the gap between theory and practice among nursing students has been a significant concern for nurse educators and healthcare institutions worldwide for decades. Lifshits and Rosenberg [18], through their review of the literature, identified the potential of AI to enhance learning outcomes among nursing students. Nonetheless, they did not quantify the strength of this effect, as measuring effectiveness was beyond the scope of their study. This ambiguity about the effect of AI prevents nursing educators from using it to its full potential and deprives nursing students of novel learning opportunities. If unresolved, this issue will have irreversible or negative consequences for quality, threatening patient safety. The purpose of this research is to address the ambiguity by evaluating the strength of the effect of AI on nursing education.
2. Background
Conventional teaching strategies have been criticised for emphasising memorisation over critical thinking and fostering passive learning. Despite initiatives to address this issue, the gap between theory and practice persists. According to studies, traditional approaches inhibit critical thinking, a crucial skill when interacting with patients [30]. Intelligent Tutoring Systems (ITS), Virtual Reality (VR), and Augmented Reality (AR), as well as emerging Chatbot and virtual assistant applications, demonstrate enormous possibilities, although the effectiveness of the aforementioned applications in nursing education is under investigation. Studies have posited that integrating AI in nursing education has improved education outcomes in terms of knowledge development, student satisfaction, and confidence [2], claiming that incorporating AI into the current nursing curriculum is among the most significant developments in higher nursing education [10]. Buchanan et al. [4] revealed that incorporating AI technology into clinical simulations has the potential to enhance nursing students' critical thinking and preparedness for real-world scenarios. Similarly, De Gagne [10] and Sheela [27] believed that advanced technology fulfils the need for active and immersive learning experiences. Romli et al. [24] indicated that integrating online learning into clinical practice improved nursing students' satisfaction.
While the aforementioned research recognized the role of AI in improving learning outcomes, Lahti et al. [16] found no differences between traditional and modern approaches but acknowledged that it could be an alternative pedagogy. The contrasting findings from the aforementioned research create an ambiguity about the effects of AI, which can limit its implementation, leading to a gap between theoretical understanding of AI use and its practical use. Hence, robust academic study and scholarly research into their effectiveness and influence in nursing education are necessary to maximise their full potential.
3. Aim and objectives
In this systematic review on RCTs, we comprehensively evaluated the literature on the effectiveness of applying artificial intelligence in nursing education, with the following specific objective:
1. To evaluate the effectiveness of AI-based technology applications on nursing students' learning outcomes in terms of knowledge level.
PICO question: In nursing students (P), how do AI-powered Intelligent Tutoring Systems (ITS) (I) compare to traditional classroom teaching methods (C) in improving knowledge level (O)?
3.1. Review protocol
The meta-analysis included the calculation of effect sizes, the fitting of a random-effects model, and the production of forest and funnel plots. Effect sizes were calculated to standardise the results on knowledge development among nursing students attributable to AI-based techniques. The pooled effect size was necessary for determining the strength of the effect. The forest plot provided a visual representation of the effects in the individual studies along with the overall effect. The funnel plot was used to identify publication bias and heterogeneity among the studies. The systematic review followed a protocol that adheres to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) 2020 statement [23]. The research question adheres to the PICO model:
P: Pre- or post-registration [undergraduate] nursing students
I: AI-based education applications
C: Comparison with other non-AI or traditional nursing education interventions
O: Effectiveness of AI on knowledge development
T: Minimum one 15-minute session of ITS
The (PRISMA) protocol for systematic literature reviews is a set of publication standards that guide reviewers on the information necessary to evaluate the quality and rigour of a review. It emphasises the reporting of systematic reviews of randomised trials, which can be utilised for reviews of different types of research [21]. The PRISMA method ensures a comprehensive review of scientific literature within a predetermined time frame, thereby enabling a precise examination of key terms related to artificial intelligence in nursing education.
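The effect-size standardisation mentioned in the review protocol can be illustrated with a short sketch. It assumes the standardised mean difference is computed as Hedges' g with the usual small-sample correction; the review does not state which SMD variant was used, and the function name `hedges_g` and the input scores are hypothetical.

```python
import math

def hedges_g(mean_ai, sd_ai, n_ai, mean_ctrl, sd_ctrl, n_ctrl):
    """Return (g, variance) for a two-group comparison of knowledge scores."""
    df = n_ai + n_ctrl - 2
    # Pooled standard deviation across the AI and control groups.
    sd_pooled = math.sqrt(((n_ai - 1) * sd_ai**2 + (n_ctrl - 1) * sd_ctrl**2) / df)
    d = (mean_ai - mean_ctrl) / sd_pooled
    # Small-sample correction factor (Hedges' approximation).
    j = 1 - 3 / (4 * df - 1)
    g = j * d
    # Large-sample approximation of the sampling variance of g.
    n_total = n_ai + n_ctrl
    var_g = j**2 * (n_total / (n_ai * n_ctrl) + d**2 / (2 * n_total))
    return g, var_g

# Hypothetical scores, not taken from any included trial.
g, var_g = hedges_g(mean_ai=75.0, sd_ai=10.0, n_ai=30,
                    mean_ctrl=70.0, sd_ctrl=10.0, n_ctrl=30)
print(f"g = {g:.3f}, variance = {var_g:.4f}")
```

A positive value favours the AI group and a negative value favours the control group, which is how the sign of the SMD is read in the results.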
3.2. Methods
This research involves a meta-analysis, a statistical technique that combines the results of different studies to answer a research question. It also involves systematic synthesis of the evidence, identification of patterns, and derivation of generalisable conclusions. In this study, the meta-analysis was used to increase the effective sample size, which improves the precision of the estimated effect. Finally, the meta-analysis is useful for resolving contradictions between studies, providing clear insight into the overall direction of the effect.
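How a random-effects model combines study results can be sketched as follows. This is only an illustration: it uses the DerSimonian-Laird estimator because it has a closed form, whereas the model reported in this review was fitted with REML, so the numbers will not match the reported output. The function name `random_effects_dl` and the input values are hypothetical.

```python
import math

def random_effects_dl(effects, variances):
    """Pool effect sizes with the DerSimonian-Laird random-effects estimator."""
    k = len(effects)
    w = [1.0 / v for v in variances]  # inverse-variance (fixed-effect) weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)  # between-study variance, truncated at 0
    w_star = [1.0 / (v + tau2) for v in variances]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return {
        "pooled": pooled,
        "se": se,
        "tau2": tau2,
        "q": q,
        "ci": (pooled - 1.96 * se, pooled + 1.96 * se),
    }

# Hypothetical SMDs and sampling variances for five studies.
effects = [-1.0, 0.3, 0.4, 0.5, 0.9]
variances = [0.06, 0.05, 0.08, 0.04, 0.07]
result = random_effects_dl(effects, variances)
print(result)
```

Adding the between-study variance to every study's sampling variance makes the weights more equal, so a single large study dominates the pooled estimate less than it would under a fixed-effect model.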
3.2.1. Eligibility criteria
The inclusion criteria for this systematic review were: RCTs on the use of AI in nursing education among nursing students; empirical studies that evaluated the effectiveness of AI interventions or technologies; studies on nursing education published in English between 2016 and 2023; and literature-type journal articles only. The term "literature-type journal" refers to peer-reviewed journal articles, excluding grey literature such as dissertations, book chapters, editorials, and conference abstracts, which may lack methodological transparency and peer-review rigour.
Postgraduate nursing students, such as master's or doctorate candidates, and any other healthcare professionals working with interprofessional teams were excluded as they do not belong to the undergraduate category and typically have completed their nursing registration. This exclusion was also intended to maintain focus on the foundational stage of nursing education, which may differ significantly in learning outcomes, scope, and curriculum design compared to postgraduate or interdisciplinary settings. Studies that did not clearly establish that they examined AI interventions, compared them with traditional methods, or specified their criteria were omitted. In addition, studies involving mixed professional groups were excluded to prevent confounding and to ensure that the findings reflect the direct impact of AI interventions on nursing students. Non-RCTs and studies with restricted access were also excluded from the analysis. All included literature had to pass the CASP Randomised Controlled Trial screening to ensure quality.
RCTs were explicitly selected in this review due to their rigorous study design, ability to demonstrate the causal relationship and variable control, and because they provided strong evidence or significant weight to the results. Systematic reviews entail primary findings and provide more evidence than other studies [5].
3.2.2. Search strategy
To identify, screen, and include literature, we performed independent electronic searches in the Scopus, PubMed, CINAHL, and Google Scholar databases. Each reviewer searched all databases using the same agreed search terms to ensure comprehensive and consistent coverage.
The RCTs most relevant to AI's effectiveness in nursing education were searched for and examined for eligibility. We used identical keywords, MeSH terms, and their synonyms when searching titles and abstracts, combining them with the Boolean operators AND and OR to enhance search precision. Although the searches were performed separately, we applied the same search strategy to minimise bias and ensure no relevant studies were missed. There was no division of labour between the reviewers, meaning each was responsible for searching all databases using the full set of search terms.
The database filter limited studies from 2016–2023, considering this was relevant for newly emerged technology. Table 1 illustrates the search strategy utilised to complete the study searches. The search started on January 15, 2023 and ended on May 31, 2023 as adequate studies on the topic were identified.
Artificial intelligence [Title/Abstract] OR AI [Title/Abstract] OR AI interventions [Title/Abstract]
OR Artificial Intelligence applications [Title/Abstract]
AND
Nursing education [Title/Abstract] OR Education of Nursing [Title/Abstract] OR Nurse education [Title/Abstract] OR Effectiveness in nurse education [Title/Abstract].
We applied a similar strategy to conduct electronic searches in additional databases. Further, we screened the reference lists to identify more studies that were viable for the review. Only English articles were included in the research.
3.2.3. Selection process
We independently identified journals pertinent to the review and eliminated duplicate data. The data were screened based on the titles, abstracts, and full text by database sorting or hand searching [21,26]. During the screening stage, titles and abstracts were evaluated based on predefined criteria, including the presence of AI-based interventions, target population (nursing students), outcome focus (knowledge development), and RCT design. Keywords such as "AI", "Artificial Intelligence", "Intelligent Tutoring System", "nursing education", and "RCT" were prioritised during the initial screening process.
Eligible and fully accessed journals were downloaded independently by us, and were then evaluated and appraised together against quality using the CASP RCT Checklist [9] (Table 2). Kraus et al. [15] emphasised the importance of addressing and resolving inconsistencies through discussion, which we fully acknowledged. A third reviewer or an expert in meta-analysis and systematic review methodology was consulted to obtain a consensus in the case of disagreements related to the inclusion of a study. However, no discrepancies emerged in this review, indicating a high level of agreement between us. A total of five studies were included and reviewed. Figure 1 shows the PRISMA 2020 flow diagram.
3.2.4. Appraisal procedure
The CASP RCT Checklist [9] was utilised to appraise and evaluate a study's quality to ensure its robustness. A total of (n = 6) full-text studies were downloaded, of which (n = 1) failed eligibility after further screening. Hence, (n = 5) final trials were examined against eleven questions assessing critical elements of randomisation, blinding, sample size calculation, data analysis, and results reporting.
A modified CASP scoring system was adopted, where a score of > 6 indicated high quality, 4–6 indicated moderate quality, and < 4 indicated low quality. This scoring was based on the number of "yes" responses to the CASP questions. As no scoring was suggested, more "yes" than "no" or "can't tell" responses implied a higher quality study and could be included in the review [15]. If a study received more than three "can't tell" responses in critical domains, it was flagged for further discussion.
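The scoring rule described above can be expressed as a small helper. This is a sketch under stated assumptions: the function name `appraise_casp` is hypothetical, and because the critical domains are not enumerated here, the "can't tell" flag is counted across all eleven questions rather than only the critical ones.

```python
def appraise_casp(responses):
    """Classify one trial from its CASP answers: 'yes', 'no' or "can't tell"."""
    score = sum(1 for r in responses if r == "yes")
    if score > 6:
        quality = "high"
    elif score >= 4:
        quality = "moderate"
    else:
        quality = "low"
    # Assumption: count "can't tell" over all questions, not only critical domains.
    needs_discussion = sum(1 for r in responses if r == "can't tell") > 3
    return {"score": score, "quality": quality, "needs_discussion": needs_discussion}

# Hypothetical answers to the eleven checklist questions for one trial.
example = ["yes"] * 9 + ["no", "can't tell"]
print(appraise_casp(example))
```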
Discrepancies or equal responses required discussion with another expert, who was an expert in meta-analysis and systematic review methodology, and would have made the final decision in the event of unresolved disagreement; however, this was not utilised as there were no discrepancies. All (n = 5) trials in this review passed the quality criteria outlined in the CASP RCT checklist [9]. Table 3 presents a detailed breakdown of the quality appraisal, where each article was evaluated based on the number of 'yes' responses. The articles with a score of 9 or above were considered high quality and included.
3.2.5. Data extraction
Each eligible study's data was extracted, and the following details were tabulated: The names of the authors, the year the study was published, the country of origin, the number of participants in the intervention and control groups, and the research outcomes.
4. Results
4.1. Characteristics of included studies
Five RCTs published between 2016 and 2023 are included in this review (Table 4).
4.1.1. Population
A total of 360 participants were recruited in the five studies; the sample size in each study ranged from a minimum of 28 to a maximum of 107 undergraduate nursing students. There were Year 1 (n = 1), Year 2 (n = 1), Year 3 (n = 1), senior nursing students (n = 1), and nursing assistant students (n = 1). The countries of origin of the included studies were Canada, China, Korea, Taiwan, and Turkey.
4.1.2. Intervention
Different types of AI-based education were applied across the RCTs and countries. The interventions comprised two virtual screenings, one interactive nursing mobile app, one immersive virtual reality application, and one AI-assisted screen-based simulation.
4.1.3. Comparison
All the studies included AI applications in teaching and learning, comparing their effects on the learning outcomes with the control groups. The control groups applied a non-AI approach, including face-to-face simulation, traditional approach, non-interactive nursing skills video, 2D video, and standard simulation.
4.1.4. Outcomes
The outcomes reported by the trials included improved knowledge or no difference in knowledge, high cognitive load, improved skills or skills not as effective as standard practice, increased anxiety, increased self-confidence or self-efficacy or no difference, high satisfaction, a self-reported preference for traditional approaches, and improved learning motivation.
4.1.5. Type
All included studies were randomised controlled trials.
4.1.6. Meta-analysis results
Random-Effects Model (k = 5; tau^2 estimator: REML)
  logLik  deviance      AIC      BIC     AICc
 -4.6307    9.2615  13.2615  12.0341  25.2615
tau^2 (estimated amount of total heterogeneity): 0.5401 (SE = 0.4242)
tau (square root of estimated tau^2 value): 0.7349
I^2 (total heterogeneity / total variability): 90.53%
H^2 (total variability/sampling variability): 10.56
Test for Heterogeneity:
Q(df = 4) = 45.0426, p-val < .0001
Model Results:
estimate      se    zval    pval    ci.lb   ci.ub
  0.2431  0.3464  0.7019  0.4827  -0.4358  0.9220
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
As evident from the meta-analysis results and forest plot, Simsek-Cetinkaya and Cakir [29] had a standardised mean difference (SMD) of -1.00, indicating a negative effect of using AI on nursing students' knowledge. In contrast, the other studies suggested a small, positive effect of using AI on the knowledge of nursing students. Gu et al. [12] reported a small, positive, and statistically significant effect of AI on the knowledge of nursing students. The random-effects model indicated a small, positive, and statistically non-significant pooled effect of using AI on the knowledge of nursing students (SMD = 0.2431, 95% CI [-0.4358, 0.9220], p = 0.4827).
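The reported test statistic, p-value, and confidence interval follow directly from the pooled estimate and its standard error. The sketch below recomputes them from the rounded values in the model results, assuming a normal (Wald-type) test and a 95% interval with a critical value of 1.96; small differences in the last decimal place are expected because the inputs are already rounded.

```python
import math

# Pooled SMD and standard error as reported in the model results.
estimate, se = 0.2431, 0.3464

z = estimate / se
# Two-sided p-value from the standard normal distribution.
p = math.erfc(abs(z) / math.sqrt(2))
# 95% Wald confidence interval.
ci_lower = estimate - 1.96 * se
ci_upper = estimate + 1.96 * se

print(f"z = {z:.4f}, p = {p:.4f}, 95% CI = [{ci_lower:.4f}, {ci_upper:.4f}]")
```

Because the interval includes zero, the data are compatible with no effect as well as with a moderate benefit, which is why the pooled effect is described as non-significant.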
The studies had high heterogeneity (I^2 = 90.53%), which indicated that most of the variation in the results reflected real differences between studies rather than random chance. The heterogeneity was statistically significant, Q(4) = 45.04, p < 0.0001. Tau^2 was 0.5401, which also indicated substantial variance between the effects across studies.
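The heterogeneity statistics are related to one another, and the reported values can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures printed in the model output; the closed-form chi-square tail probability used for the Q test is valid only when the degrees of freedom are even, which holds here (df = 4).

```python
import math

# Values reported in the model output.
tau2, i2_percent = 0.5401, 90.53
q, df = 45.0426, 4

tau = math.sqrt(tau2)            # square root of the between-study variance
h2 = 1 / (1 - i2_percent / 100)  # H^2 = 1 / (1 - I^2)

# Upper-tail chi-square probability for Q (closed form for even df).
half = q / 2
p_q = math.exp(-half) * sum(half**k / math.factorial(k) for k in range(df // 2))

print(f"tau = {tau:.4f}, H^2 = {h2:.2f}, p(Q) = {p_q:.2e}")
```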
The asymmetry in the funnel plot also indicated high heterogeneity between the studies. This indicated that the relationship between AI use and nursing education was complex and needs to be investigated further.
5. Findings
Three major themes were formulated based on the analysis of a total number of (n = 5) trials included in this review. As presented in Table 4, the themes were cognitive (2 sub-themes), psychomotor (1 sub-theme), and affective (3 sub-themes). Of 5 trials, 4 involved the effects of AI intervention on cognition, 3 involved the effects on psychomotor, and 4 involved the effects of AI application on emotions (Table 5). Cognitive was further divided into knowledge level and cognitive load; psychomotor consisted of skills and performance, and emotions comprised anxiety, satisfaction, and confidence.
5.1. Cognitive
5.1.1. Knowledge levels
Of the 5 trials, most (n = 3) showed that AI improved knowledge, while one trial revealed no difference between AI and the traditional intervention. This aligns with Buchanan et al. [4], who suggested that AI technology in clinical simulations improved nursing students' knowledge, critical thinking, and readiness for real-life circumstances. De Gagne [10] and Sheela [27] agreed that sophisticated technology meets the demand for active and immersive learning experiences. They also highlighted the effectiveness of active learning in enhancing learning outcomes through the level of knowledge, boosting confidence, and resulting in a higher quality of care. This suggests that AI-based technology can be effective in teaching and learning.
On the other hand, Cobbett and Snelgrove-Clarke [8] did not find that AI improved knowledge and reported no difference from traditional approaches. However, they emphasised that it is more convenient in terms of time and location, which would result in cost savings if implemented. Therefore, this finding encourages a transition to telehealth, where hybrid education can be delivered to more individuals, especially in underprivileged areas. In addition, this would help overcome limited resources, such as faculty scarcity, allowing more students to be admitted into the profession and thereby easing the global shortage of nurses and nursing academics.
5.1.2. Cognitive load
Of the 5 trials, only 1 reported high cognitive load, measured on a 5-point Likert scale (1 = "strongly disagree" to 5 = "strongly agree"). Researchers have agreed that AI-based learning increases cognitive load due to unrelated complex visuals, images, details, and interactions [20]. Frederiksen et al. [11] supported this result and argued that it improves learning outcomes. From a different perspective, Bal and Bicen [1] stated that applying technology to collaborative learning can efficiently reduce cognitive burden while maintaining cognitive challenge. It gives students the knowledge and skills to boost their confidence and self-esteem in real-life circumstances.
Apart from that, the increased cognitive load could be due to the unfamiliarity with the new AI interface. Hence, more time allocation and adequate training must be given before application. Brookhart [3] agreed that learning needs time to acquire and integrate knowledge before it can be effectively applied. Knowledge of the resulting consequences and advantages enables informed decisions regarding the future design of the AI-based tool.
5.2. Psychomotor
Of the 5 trials, 3 assessed skill performance. AI improved skills in 2 trials, while it was ineffective in the third. As De Gagne [10] noted, AI-based technology enables realistic simulation and brings students close to the real-world environment, particularly when students lack the opportunity to practise. Furthermore, it enables the trainee to practise repeatedly without compromising patient safety [27]. This result was supported by Saifan et al. [25] in a different study. Therefore, AI ensures nursing quality while providing an engaging and immersive experience. It enables repetitive practice without fear of making life-threatening mistakes; hence, incorporating advanced technology alleviates students' stress and increases their self-efficacy.
On the contrary, in another trial, AI was reported to be ineffective. In this circumstance, the language barrier was discussed. Thus, step-by-step instructions on practical skill training should be fully developed, and external barriers such as language should be addressed to capitalise on the full value of the implementation. Further study should be conducted to explore the perceptions on the challenges and preparedness of integrating AI into nursing education to understand the experiences and gain new insights about AI adoption.
5.3. Emotions
5.3.1. Anxiety
Of 5 trials, only 2 entailed anxiety levels, and both reported increased anxiety in AI intervention. Consequently, the stress hormone cortisol can overwhelm the hippocampus, impairing its ability to learn, remember, or retrieve stored knowledge [6]. This is similar to a study conducted by Cheung and Au [7], who discovered that underperformance can be attributed to stress and anxiety. Besides, it is widely acknowledged that students face similar challenges when transitioning from the classroom to actual clinical settings. Further, this result can be influenced by other factors, including a lack of preparation, unfamiliarity with the new surroundings, and uncertainty of what to expect, especially with a newly emerging technology. Therefore, it is essential to incorporate anxiety-reducing measures into implementing AI-based technology, particularly in clinical practice, to help students cope with this attribute, enhancing their learning process. However, the few studies assessing these outcomes may not provide a comprehensive picture of how AI interventions affect emotional and psychological states.
5.3.2. Satisfaction
Only 2 of 5 trials entailed satisfaction, both showing excellent results. This was consistent with the findings of Padilha et al. [22]. Chao et al. (2021) mentioned that students preferred VR learning because it was unique and enjoyable. Aligned with Cant and Cooper's (2014) study, this indicated acceptance of the emerging pedagogy, suggesting a potential shift toward advanced technology. Nevertheless, Cobbett and Snelgrove-Clarke [8] discovered that nursing students opted for traditional approaches due to difficulties such as technological constraints. This emphasises the need to explore further issues, including solving connectivity and tool-related issues, to ensure students take full advantage of their opportunities. Despite reporting technological challenges, they support students' enjoyment of modern technology, strengthening AI's potential.
5.3.3. Confidence
Of 5 trials, 3 entailed the confidence level. Two of 3 exhibited high confidence, while Cobbett and Snelgrove-Clarke's [8] study indicated no difference and preferred traditional intervention. AI technologies, such as personalised training, engaging experiences, and immediate feedback, reliably improve students' confidence and knowledge [27]. This benefits students as it indirectly creates a learning context that enables them to take risks and avoid the fear of humiliation or shame. However, the impact on confidence may vary due to different learning styles, while other AI tools like ChatGPT and virtual assistants are reported to foster self-directed learning [24]. Lee and Han [17] validated AI's potential by discovering a beneficial effect on personal efficacy, clinical reasoning competence, and student learning satisfaction. Moreover, prompt feedback strengthens comprehension, directly improving students' knowledge and skill outcomes. Consequently, it improves safety, fostering the best possible healthcare. Thus, enhancing student and patient satisfaction improves healthcare quality [28].
5.4. Implications
In this review, we highlight the effectiveness of applying AI technology in nursing education while acknowledging the challenges that may hinder its implementation. Further studies addressing these challenges are recommended to explore the effectiveness of AI, especially in nursing education. As the healthcare landscape is rapidly evolving, the preparedness of faculty and students must be ensured to reap its full benefits. The advancement of AI is inevitable in this digital era, so measures to meet the industry's contemporary demands are urgently needed. This review can guide nurse educators in selecting the best pedagogical approach to produce high-quality graduates and achieve educational objectives.
6. Limitations and strengths
To the best of our knowledge, this is the first meta-analysis of RCTs to evaluate the effectiveness of AI-based approaches in nursing education. However, given the limited number of included studies, restricted access to several studies, and the possibility of publication bias and language limitations, caution is advised. It is essential to acknowledge these limitations and interpret the results with discretion. Moreover, additional research, particularly randomised controlled studies, is needed to comprehensively evaluate AI in nursing education and resolve nursing issues, providing a solid evidence base and generalisable results.
7. Conclusions
For the meta-analysis, we used randomised controlled trials to examine the effectiveness of integrating AI into nursing education. Incorporating AI into nursing education is exceptionally pertinent and timely, given its growing impact on the healthcare and nursing education industry. The findings highlight the educational benefits of AI adoption and provide insightful knowledge for nurse educators, policymakers, and researchers. Adopting AI in nursing education prepares future nurses for the digital age and helps meet the industry's demand. Advancing knowledge of AI implementation requires the collaboration and support of educators, policymakers, and stakeholders. Moreover, nursing education can be enhanced using AI-based solutions, optimising outcomes and fostering pedagogical innovation, thereby helping to resolve the long-standing theory-practice gap. Due to the large variation between the studies, it can be argued that a single pooled effect is not sufficient to capture the effects of AI in nursing education. Further analysis of moderators and subgroups may be required for a clearer understanding.
Author contributions
Jandy and Shalin jointly designed the study, conducted the literature search, collected and organised the data, provided research materials, and performed data analysis and interpretation. The reviewers independently identified relevant journals, eliminated duplicate data, and downloaded eligible journals. Both authors collaborated on enhancing study quality and resolved any discrepancies through discussion, without requiring consultation with a third reviewer. Jandy authored the initial and final drafts of the article and provided logistic support. Both authors have critically reviewed and approved the final draft and take full responsibility for the manuscript's content and publication.
Use of Generative-AI tools declaration
The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.
Conflict of interest
The authors declare there is no conflict of interest in this paper.
Ethics declaration
The authors declare that no ethics approval was required for the study.