Published in Vol 9 (2023)

Preprints (earlier versions) of this paper are available at https://preprints.jmir.org/preprint/48305.
Variability in Large Language Models’ Responses to Medical Licensing and Certification Examinations. Comment on “How Does ChatGPT Perform on the United States Medical Licensing Examination? The Implications of Large Language Models for Medical Education and Knowledge Assessment”


Authors of this article:

Richard H Epstein; Franklin Dexter

Journals

  1. Velásquez-Henao J, Franco-Cardona C, Cadavid-Higuita L. Prompt Engineering: a methodology for optimizing interactions with AI-Language Models in the field of engineering. DYNA 2023;90(230):9
  2. Gilson A, Safranek C, Huang T, Socrates V, Chi L, Taylor R, Chartash D. Authors’ Reply to: Variability in Large Language Models’ Responses to Medical Licensing and Certification Examinations. JMIR Medical Education 2023;9:e50336
  3. Guillen-Grima F, Guillen-Aguinaga S, Guillen-Aguinaga L, Alas-Brun R, Onambele L, Ortega W, Montejo R, Aguinaga-Ontoso E, Barach P, Aguinaga-Ontoso I. Evaluating the Efficacy of ChatGPT in Navigating the Spanish Medical Residency Entrance Examination (MIR): Promising Horizons for AI in Clinical Medicine. Clinics and Practice 2023;13(6):1460
  4. Meyer A, Riese J, Streichert T. Comparison of the Performance of GPT-3.5 and GPT-4 With That of Medical Students on the Written German Medical Licensing Examination: Observational Study. JMIR Medical Education 2024;10:e50965
  5. Gordon M, Daniel M, Ajiboye A, Uraiby H, Xu N, Bartlett R, Hanson J, Haas M, Spadafore M, Grafton-Clarke C, Gasiea R, Michie C, Corral J, Kwan B, Dolmans D, Thammasitboon S. A scoping review of artificial intelligence in medical education: BEME Guide No. 84. Medical Teacher 2024;46(4):446
  6. Duggan R, Tsuruda K. ChatGPT performance on radiation technologist and therapist entry to practice exams. Journal of Medical Imaging and Radiation Sciences 2024;55(4):101426
  7. Pohl N, Derector E, Rivlin M, Bachoura A, Tosti R, Kachooei A, Beredjiklian P, Fletcher D. A quality and readability comparison of artificial intelligence and popular health website education materials for common hand surgery procedures. Hand Surgery and Rehabilitation 2024;43(3):101723
  8. Niset A, El Hadwe S, Englebert A, Barrit S. AI in emergency medicine: Building literacy or castles in the air. The American Journal of Emergency Medicine 2025;87:145
  9. Chen C, Bilolikar V, VanNest D, Raphael J, Shaffer G. Artificial intelligence in orthopaedic education: A comparative analysis of ChatGPT and Bing AI's Orthopaedic In‐Training Examination performance. Medicine Advances 2024;2(3):284
  10. Aster A, Laupichler M, Rockwell-Kollmann T, Masala G, Bala E, Raupach T. ChatGPT and Other Large Language Models in Medical Education — Scoping Literature Review. Medical Science Educator 2024
  11. Kanzawa J, Kurokawa R, Kaiume M, Nakamura Y, Kurokawa M, Sonoda Y, Gonoi W, Abe O. Evaluating the Role of GPT-4 and GPT-4o in the Detectability of Chest Radiography Reports Requiring Further Assessment. Cureus 2024

Books/Policy Documents

  1. Burbano G. D, Ibarra C. J. Telematics and Computing.