%0 Journal Article
%@ 2369-3762
%I JMIR Publications
%V 11
%N
%P e72034
%T Assessment of Large Language Model Performance on Medical School Essay-Style Concept Appraisal Questions: Exploratory Study
%A Mehta,Seysha
%A Haddad,Eliot N
%A Burke,Indira Bhavsar
%A Majors,Alana K
%A Maeda,Rie
%A Burke,Sean M
%A Deshpande,Abhishek
%A Nowacki,Amy S
%A Lindenmeyer,Christina C
%A Mehta,Neil
%K essay-type questions
%K large language models
%K generative AI
%K Microsoft Copilot
%K artificial intelligence
%D 2025
%7 16.6.2025
%9
%J JMIR Med Educ
%G English
%X Bing Chat (subsequently renamed Microsoft Copilot), a ChatGPT 4.0–based large language model, demonstrated comparable performance to medical students in answering essay-style concept appraisals, while assessors struggled to differentiate artificial intelligence (AI) responses from human responses. These results highlight the need to prepare students and educators for a future world of AI by fostering reflective learning practices and critical thinking.
%R 10.2196/72034
%U https://mededu.jmir.org/2025/1/e72034
%U https://doi.org/10.2196/72034