%0 Journal Article
%@ 2369-3762
%I JMIR Publications
%V 11
%N 
%P e72034
%T Assessment of Large Language Model Performance on Medical School Essay-Style Concept Appraisal Questions: Exploratory Study
%A Mehta,Seysha
%A Haddad,Eliot N
%A Burke,Indira Bhavsar
%A Majors,Alana K
%A Maeda,Rie
%A Burke,Sean M
%A Deshpande,Abhishek
%A Nowacki,Amy S
%A Lindenmeyer,Christina C
%A Mehta,Neil
%K essay-type questions
%K large language models
%K generative AI
%K Microsoft Copilot
%K artificial intelligence
%D 2025
%7 16.6.2025
%9 
%J JMIR Med Educ
%G English
%X Bing Chat (subsequently renamed Microsoft Copilot)—a ChatGPT 4.0–based large language model—demonstrated comparable performance to medical students in answering essay-style concept appraisals, while assessors struggled to differentiate artificial intelligence (AI) responses from human responses. These results highlight the need to prepare students and educators for a future world of AI by fostering reflective learning practices and critical thinking.
%R 10.2196/72034
%U https://mededu.jmir.org/2025/1/e72034
%U https://doi.org/10.2196/72034