Published in Vol 10 (2024)
Preprints (earlier versions) of this paper are available at https://preprints.jmir.org/preprint/55048.

Journals
- Liu M, Okuhara T, Chang X, Shirabe R, Nishiie Y, Okada H, Kiuchi T. Performance of ChatGPT Across Different Versions in Medical Licensing Examinations Worldwide: Systematic Review and Meta-Analysis. Journal of Medical Internet Research 2024;26:e60807
- Tong W, Zhang X, Zeng H, Pan J, Gong C, Zhang H. Reforming China’s Secondary Vocational Medical Education: Adapting to the Challenges and Opportunities of the AI Era. JMIR Medical Education 2024;10:e48594
- Kipp M. From GPT-3.5 to GPT-4.o: A Leap in AI’s Medical Exam Performance. Information 2024;15(9):543
- Leventoglu E, Soran M. Clinical Characteristics of Children with Acute Post-Streptococcal Glomerulonephritis and Re-Evaluation of Patients with Artificial Intelligence. Medeniyet Medical Journal 2024
- Nakaura T, Yoshida N, Kobayashi N, Nagayama Y, Uetani H, Kidoh M, Oda S, Funama Y, Hirai T. Performance of Multimodal Large Language Models in Japanese Diagnostic Radiology Board Examinations (2021-2023). Academic Radiology 2025;32(5):2394
- Kim J, Vajravelu B. Assessing the Current Limitations of Large Language Models in Advancing Health Care Education. JMIR Formative Research 2025;9:e51319
- Qiu Y, Liu C. Capable exam-taker and question-generator: the dual role of generative AI in medical education assessment. Global Medical Education 2025
- Nguyen H, Dang H, Nguyen T, Hoang V, Nguyen V, Wu J. Accuracy of latest large language models in answering multiple choice questions in dentistry: A comparative study. PLOS ONE 2025;20(1):e0317423
- Zhao Q, Wang H, Wang R, Cao H. Deriving insights from enhanced accuracy: Leveraging prompt engineering in custom GPT for assessing Chinese Nursing Licensing Exam. Nurse Education in Practice 2025;84:104284
- Wang J, Shue K, Liu L, Hu G. Preliminary evaluation of ChatGPT model iterations in emergency department diagnostics. Scientific Reports 2025;15(1)
- Rodrigues Alessi M, Gomes H, Oliveira G, Lopes de Castro M, Grenteski F, Miyashiro L, do Valle C, Tozzini Tavares da Silva L, Okamoto C. Comparative Performance of Medical Students, ChatGPT-3.5 and ChatGPT-4.0 in Answering Questions From a Brazilian National Medical Exam: Cross-Sectional Questionnaire Study. JMIR AI 2025;4:e66552
- Altermatt F, Neyem A, Sumonte N, Mendoza M, Villagran I, Lacassie H. Performance of single-agent and multi-agent language models in Spanish language medical competency exams. BMC Medical Education 2025;25(1)
- Wang L, Li J, Zhuang B, Huang S, Fang M, Wang C, Li W, Zhang M, Gong S. Accuracy of Large Language Models When Answering Clinical Research Questions: Systematic Review and Network Meta-Analysis. Journal of Medical Internet Research 2025;27:e64486
- Park C, An M, Hwang G, Park R, An J. Clinical Performance and Communication Skills of ChatGPT Versus Physicians in Emergency Medicine: Simulated Patient Study. JMIR Medical Informatics 2025;13:e68409
- Wu H, Zerner T, Lee D, Court-Kowalski S, Devitt P, Palmer E. GPT-4 versus human authors in clinically complex MCQ creation: A blinded analysis of item quality. Medical Teacher 2025:1
- Cheng Y, Zhu L. A review of ChatGPT in medical education: exploring advantages and limitations. International Journal of Surgery 2025;111(7):4586
- Boral K, Mondal K. Comparing AI-Generated Responses: A Study on ChatGPT, Gemini, and Copilot in Education. Journal of Educational Technology Systems 2025;54(2):291
- Nakaura T, Uetani H, Yoshida N, Kobayashi N, Nagayama Y, Kidoh M, Kuroda J, Mukasa A, Hirai T. Intra-axial primary brain tumor differentiation: comparing large language models on structured MRI reports vs. radiologists on images. European Radiology 2025
- Saowaprut P, Wabina R, Yang J, Siriwat L. Performance of large language models on Thailand’s national medical licensing examination: a cross-sectional study. Journal of Educational Evaluation for Health Professions 2025;22:16
- Latkowska A, Sawina P, Dolata T, Boczkowski D, Wielochowska A, Kowalczyk A, Loson-Kawalec M, Radej D, Jaworski W, Majchrowicz W, Olender M, Adamiak J, Sroczynska J, Suleiman R, Glinska J, Szczerbanowicz P, Dadynska P. Assessment of the Ability of the ChatGPT-5 Model to Pass the Endocrinology Specialization Exam. Cureus 2025
- Kasagga A, Sapkota A, Changaramkumarath G, Abucha J, Wollel M, Somannagari N, Husami M, Hailu K, Kasagga E. Performance of ChatGPT and Large Language Models on Medical Licensing Exams Worldwide: A Systematic Review and Network Meta-Analysis With Meta-Regression. Cureus 2025
Books/Policy Documents
- Pérez G, Gamboa A. The Second International Symposium on Generative AI and Education (ISGAIE’2025).
