Published on in Vol 10 (2024)

Preprints (earlier versions) of this paper are available at https://preprints.jmir.org/preprint/52784, first published .
Influence of Model Evolution and System Roles on ChatGPT’s Performance in Chinese Medical Licensing Exams: Comparative Study

Influence of Model Evolution and System Roles on ChatGPT’s Performance in Chinese Medical Licensing Exams: Comparative Study

Influence of Model Evolution and System Roles on ChatGPT’s Performance in Chinese Medical Licensing Exams: Comparative Study

Journals

  1. Kipp M. From GPT-3.5 to GPT-4.o: A Leap in AI’s Medical Exam Performance. Information 2024;15(9):543 View
  2. Brin D, Sorin V, Konen E, Nadkarni G, Glicksberg B, Klang E. How GPT models perform on the United States medical licensing examination: a systematic review. Discover Applied Sciences 2024;6(10) View
  3. Wang L, Li J, Zhuang B, Huang S, Fang M, Wang C, Li W, Zhang M, Gong S. Accuracy of Large Language Models When Answering Clinical Research Questions: Systematic Review and Network Meta-Analysis. Journal of Medical Internet Research 2025;27:e64486 View
  4. Wang W, Fu J, Zhang Y, Hu K. A Comparative Analysis of GPT-4o and ERNIE Bot in a Chinese Radiation Oncology Exam. Journal of Cancer Education 2025 View
  5. Wu J, Wang Z, Qin Y. Performance of DeepSeek-R1 and ChatGPT-4o on the Chinese National Medical Licensing Examination: A Comparative Study. Journal of Medical Systems 2025;49(1) View
  6. Cheng Y, Zhu L. A review of ChatGPT in medical education: exploring advantages and limitations. International Journal of Surgery 2025;111(7):4586 View
  7. Schwarzkopf S, Bereuter J, Geissler M, Weitz J, Distler M, Kolbinger F, Berens P. Postoperative complication management: How do large language models measure up to human expertise?. PLOS Digital Health 2025;4(8):e0000933 View
  8. Zhang Y, Xie X, Xu Q. ChatGPT in Medical Education: Bibliometric and Visual Analysis. JMIR Medical Education 2025;11:e72356 View
  9. Ming S, Yao X, Guo Q, Chen D, Guo X, Xie K, Lei B. Evaluation of DeepSeek-R1 for Ophthalmic Diagnosis and Reasoning: A Comparison with OpenAI o1 and o3. Journal of Medical Systems 2025;49(1) View
  10. Kasagga A, Sapkota A, Changaramkumarath G, Abucha J, Wollel M, Somannagari N, Husami M, Hailu K, Kasagga E. Performance of ChatGPT and Large Language Models on Medical Licensing Exams Worldwide: A Systematic Review and Network Meta-Analysis With Meta-Regression. Cureus 2025 View
  11. Inojosa H, Ramezanzadeh A, Gasparovic-Curtini I, Wiest I, Kather J, Gilbert S, Ziemssen T. Education Research: Can Large Language Models Match MS Specialist Training?. Neurology Education 2025;4(4) View