TY  - JOUR
AU  - Montagna, Marco
AU  - Chiabrando, Filippo
AU  - De Lorenzo, Rebecca
AU  - Rovere Querini, Patrizia
PY  - 2025
DA  - 2025/3/18
TI  - Impact of Clinical Decision Support Systems on Medical Students’ Case-Solving Performance: Comparison Study with a Focus Group
JO  - JMIR Med Educ
SP  - e55709
VL  - 11
KW  - chatGPT
KW  - chatbot
KW  - machine learning
KW  - ML
KW  - artificial intelligence
KW  - AI
KW  - algorithm
KW  - predictive model
KW  - predictive analytics
KW  - predictive system
KW  - practical model
KW  - deep learning
KW  - large language models
KW  - LLMs
KW  - medical education
KW  - medical teaching
KW  - teaching environment
KW  - clinical decision support systems
KW  - CDSS
KW  - decision support
KW  - decision support tool
KW  - clinical decision-making
KW  - innovative teaching
AB  - Background: Health care practitioners use clinical decision support systems (CDSS) as an aid in the crucial task of clinical reasoning and decision-making. Traditional CDSS are online repositories (ORs) and clinical practice guidelines (CPG). Recently, large language models (LLMs) such as ChatGPT have emerged as potential alternatives. They have proven to be powerful, innovative tools, yet they are not devoid of worrisome risks. Objective: This study aims to explore how medical students perform in an evaluated clinical case through the use of different CDSS tools. Methods: The authors randomly divided medical students into 3 groups, CPG, n=6 (38%); OR, n=5 (31%); and ChatGPT, n=5 (31%); and assigned each group a different type of CDSS for guidance in answering prespecified questions, assessing how students’ speed and ability at resolving the same clinical case varied accordingly. External reviewers evaluated all answers based on accuracy and completeness metrics (score: 1‐5). The authors analyzed and categorized group scores according to the skill investigated: differential diagnosis, diagnostic workup, and clinical decision-making. Results: Answering time showed a trend for the ChatGPT group to be the fastest. The mean scores for completeness were as follows: CPG 4.0, OR 3.7, and ChatGPT 3.8 (P=.49). The mean scores for accuracy were as follows: CPG 4.0, OR 3.3, and ChatGPT 3.7 (P=.02). Aggregating scores according to the 3 students’ skill domains, trends in differences among the groups emerged more clearly, with the CPG group performing best in nearly all domains and maintaining almost perfect alignment between its completeness and accuracy. Conclusions: This hands-on session provided valuable insights into the potential perks and associated pitfalls of LLMs in medical education and practice. It suggested the critical need to include teachings in medical degree courses on how to properly take advantage of LLMs, as the potential for misuse is evident and real.
SN  - 2369-3762
UR  - https://mededu.jmir.org/2025/1/e55709
UR  - https://doi.org/10.2196/55709
DO  - 10.2196/55709
ID  - info:doi/10.2196/55709
ER  -