Published in Vol 9 (2023)

Preprints (earlier versions) of this paper are available at https://preprints.jmir.org/preprint/46482.
Artificial Intelligence in Medical Education: Comparative Analysis of ChatGPT, Bing, and Medical Students in Germany

Journals

  1. Tangadulrat P, Sono S, Tangtrakulwanich B. Using ChatGPT for Clinical Practice and Medical Education: Cross-Sectional Survey of Medical Students’ and Physicians’ Perceptions. JMIR Medical Education 2023;9:e50658
  2. Knopp M, Warm E, Weber D, Kelleher M, Kinnear B, Schumacher D, Santen S, Mendonça E, Turner L. AI-Enabled Medical Education: Threads of Change, Promising Futures, and Risky Realities Across Four Potential Future Worlds. JMIR Medical Education 2023;9:e50373
  3. Zhang Z, Zhang J, Duan L, Tan C. ChatGPT in dermatology: exploring the limited utility amidst the tech hype. Frontiers in Medicine 2024;10
  4. Abdaljaleel M, Barakat M, Alsanafi M, Salim N, Abazid H, Malaeb D, Mohammed A, Hassan B, Wayyes A, Farhan S, Khatib S, Rahal M, Sahban A, Abdelaziz D, Mansour N, AlZayer R, Khalil R, Fekih-Romdhane F, Hallit R, Hallit S, Sallam M. A multinational study on the factors influencing university students’ attitudes and usage of ChatGPT. Scientific Reports 2024;14(1)
  5. Gordon M, Daniel M, Ajiboye A, Uraiby H, Xu N, Bartlett R, Hanson J, Haas M, Spadafore M, Grafton-Clarke C, Gasiea R, Michie C, Corral J, Kwan B, Dolmans D, Thammasitboon S. A scoping review of artificial intelligence in medical education: BEME Guide No. 84. Medical Teacher 2024;46(4):446
  6. Rojas M, Rojas M, Burgess V, Toro-Pérez J, Salehi S. Exploring the Performance of ChatGPT Versions 3.5, 4, and 4 With Vision in the Chilean Medical Licensing Examination: Observational Study. JMIR Medical Education 2024;10:e55048
  7. Warrier A, Singh R, Haleem A, Zaki H, Eloy J. The Comparative Diagnostic Capability of Large Language Models in Otolaryngology. The Laryngoscope 2024;134(9):3997
  8. Andreychenko A, Gusev A. Perspectives on the application of large language models in healthcare. National Health Care (Russia) 2024;4(4):48
  9. Moulaei K, Yadegari A, Baharestani M, Farzanbakhsh S, Sabet B, Reza Afrash M. Generative artificial intelligence in healthcare: A scoping review on benefits, challenges and applications. International Journal of Medical Informatics 2024;188:105474
  10. Bharatha A, Ojeh N, Fazle Rabbi A, Campbell M, Krishnamurthy K, Layne-Yarde R, Kumar A, Springer D, Connell K, Majumder M. Comparing the Performance of ChatGPT-4 and Medical Students on MCQs at Varied Levels of Bloom’s Taxonomy. Advances in Medical Education and Practice 2024;Volume 15:393
  11. Griewing S, Knitza J, Boekhoff J, Hillen C, Lechner F, Wagner U, Wallwiener M, Kuhn S. Evolution of publicly available large language models for complex decision-making in breast cancer care. Archives of Gynecology and Obstetrics 2024;310(1):537
  12. Zengin A, Ulfanov O, Bag Y, Ulas M. Artificial Intelligence Versus Medical Students in General Surgery Exam. Indian Journal of Surgery 2025;87(1):68
  13. Liu M, Okuhara T, Chang X, Shirabe R, Nishiie Y, Okada H, Kiuchi T. Performance of ChatGPT Across Different Versions in Medical Licensing Examinations Worldwide: Systematic Review and Meta-Analysis. Journal of Medical Internet Research 2024;26:e60807
  14. Rossettini G, Rodeghiero L, Corradi F, Cook C, Pillastrini P, Turolla A, Castellini G, Chiappinotto S, Gianola S, Palese A. Comparative accuracy of ChatGPT-4, Microsoft Copilot and Google Gemini in the Italian entrance test for healthcare sciences degrees: a cross-sectional study. BMC Medical Education 2024;24(1)
  15. Patel E, Fleischer L, Filip P, Eggerstedt M, Hutz M, Michaelides E, Batra P, Tajudeen B. Comparative Performance of ChatGPT 3.5 and GPT4 on Rhinology Standardized Board Examination Questions. OTO Open 2024;8(2)
  16. Li K, Fernandez A, Schwartz R, Rios N, Carlisle M, Amend G, Patel H, Breyer B. Comparing GPT-4 and Human Researchers in Health Care Data Analysis: Qualitative Description Study. Journal of Medical Internet Research 2024;26:e56500
  17. Giray L, Aquino R. Use and impact of ChatGPT on undergraduate engineering students: A case from the Philippines. Internet Reference Services Quarterly 2024;28(4):453
  18. Aljamaan F, Temsah M, Altamimi I, Al-Eyadhy A, Jamal A, Alhasan K, Mesallam T, Farahat M, Malki K. Reference Hallucination Score for Medical Artificial Intelligence Chatbots: Development and Usability Study. JMIR Medical Informatics 2024;12:e54345
  19. Jeong J, Gil D, Kim D, Jeong J. Current Research and Future Directions for Off-Site Construction through LangChain with a Large Language Model. Buildings 2024;14(8):2374
  20. Suresh S, Misra S. Large Language Models in Pediatric Education: Current Uses and Future Potential. Pediatrics 2024;154(3)
  21. Wang Y, Liang L, Li R, Wang Y, Hao C. Comparison of the Performance of ChatGPT, Claude and Bard in Support of Myopia Prevention and Control. Journal of Multidisciplinary Healthcare 2024;Volume 17:3917
  22. Alnaim N, AlSanad D, Albelali S, Almulhem M, Almuhanna A, Attar R, Alsahli M, Albagmi S, Bakhshwain A, Almazrou S, Almutairi S, AboAlsamh H, Arif W, Alsadhan A, Alsedrah I, Alanezi F, Alibrahim D, Alqahtani N. Effectiveness of ChatGPT in remote learning environments: An empirical study with medical students in Saudi Arabia. Nutrition and Health 2024
  23. Al-Naser Y, Halka F, Ng B, Mountford D, Sharma S, Niure K, Yong-Hing C, Khosa F, Van der Pol C. Evaluating Artificial Intelligence Competency in Education: Performance of ChatGPT-4 in the American Registry of Radiologic Technologists (ARRT) Radiography Certification Exam. Academic Radiology 2025;32(2):597
  24. Fraga-Sastrías J, Navarrini H, Silva-Brehuer M, Espejo-González R, Olvera-Cortés H, Rubio-Martínez R. Uso de Chat-GPT para la generación y conducción de escenarios simulados para el aprendizaje de habilidades no técnicas. Revista Latinoamericana de Simulación Clínica 2024;6(2):64
  25. Armbruster J, Bussmann F, Rothhaas C, Titze N, Grützner P, Freischmidt H. “Doctor ChatGPT, Can You Help Me?” The Patient’s Perspective: Cross-Sectional Study. Journal of Medical Internet Research 2024;26:e58831
  26. Wu J, Nishida T, Liu T. Accuracy of large language models in answering ophthalmology board-style questions: A meta-analysis. Asia-Pacific Journal of Ophthalmology 2024;13(5):100106
  27. Abdul Sami M, Abdul Samad M, Parekh K, Suthar P. Comparative Accuracy of ChatGPT 4.0 and Google Gemini in Answering Pediatric Radiology Text-Based Questions. Cureus 2024
  28. Sallam M, Al-Mahzoum K, Almutairi Y, Alaqeel O, Abu Salami A, Almutairi Z, Alsarraf A, Barakat M. Anxiety among Medical Students Regarding Generative Artificial Intelligence Models: A Pilot Descriptive Study. International Medical Education 2024;3(4):406
  29. Le K, Chen J, Mai D, Le K. An Evaluation on the Potential of Large Language Models for Use in Trauma Triage. Emergency Care and Medicine 2024;1(4):350
  30. Liu M, Okuhara T, Chang X, Okada H, Kiuchi T, Khlaif Z. Performance of ChatGPT in medical licensing examinations in countries worldwide: A systematic review and meta-analysis protocol. PLOS ONE 2024;19(10):e0312771
  31. Harigai A, Toyama Y, Nagano M, Abe M, Kawabata M, Li L, Yamamura J, Takase K. Response accuracy of GPT-4 across languages: insights from an expert-level diagnostic radiology examination in Japan. Japanese Journal of Radiology 2025;43(2):319
  32. Cotohuanca Cruz S, Arredondo-Zela S, Grández-Ventura L. Uso del ChatGPT y el rendimiento académico en estudiantes de una Universidad Privada. REVISTA EDUSER 2024;11(1):29
  33. Alli S, Hossain S, Das S, Upshur R. The Potential of Artificial Intelligence Tools for Reducing Uncertainty in Medicine and Directions for Medical Education. JMIR Medical Education 2024;10:e51446
  34. Aster A, Laupichler M, Rockwell-Kollmann T, Masala G, Bala E, Raupach T. ChatGPT and Other Large Language Models in Medical Education — Scoping Literature Review. Medical Science Educator 2024;35(1):555
  35. Liu M, Okuhara T, Huang W, Ogihara A, Nagao H, Okada H, Kiuchi T. Large Language Models in Dental Licensing Examinations: Systematic Review and Meta-Analysis. International Dental Journal 2025;75(1):213
  36. Chen R, Zeng D, Li Y, Huang R, Sun D, Li T. Evaluating the performance and clinical decision‐making impact of ChatGPT‐4 in reproductive medicine. International Journal of Gynecology & Obstetrics 2025;168(3):1285
  37. Lee J, Park S, Shin J, Cho B. Analyzing evaluation methods for large language models in the medical field: a scoping review. BMC Medical Informatics and Decision Making 2024;24(1)
  38. Roos J, Wilhelm T, Martin R, Kaczmarczyk R. From Language Models to Medical Diagnoses: Assessing the Potential of GPT-4 and GPT-3.5-Turbo in Digital Health. AI 2024;5(4):2680
  39. Zong H, Wu R, Cha J, Wang J, Wu E, Li J, Zhou Y, Zhang C, Feng W, Shen B. Large Language Models in Worldwide Medical Exams: Platform Development and Comprehensive Analysis. Journal of Medical Internet Research 2024;26:e66114
  40. Roos J, Martin R, Kaczmarczyk R. Evaluating Bard Gemini Pro and GPT-4 Vision Against Student Performance in Medical Visual Question Answering: Comparative Case Study. JMIR Formative Research 2024;8:e57592
  41. Sabaner M, Anguita R, Antaki F, Balas M, Boberg-Ans L, Ferro Desideri L, Grauslund J, Hansen M, Klefter O, Potapenko I, Rasmussen M, Subhi Y. Opportunities and Challenges of Chatbots in Ophthalmology: A Narrative Review. Journal of Personalized Medicine 2024;14(12):1165
  42. Özkan T, Acar A, Özkan E, Düzyol M, Öztürk E. Are artificial intelligence based chatbots reliable sources for patients regarding orthodontics? APOS Trends in Orthodontics 2025;15:141
  43. Qiu Y, Liu C. Capable exam-taker and question-generator: the dual role of generative AI in medical education assessment. Global Medical Education 2025
  44. Ajalo E, Mukunya D, Nantale R, Kayemba F, Pangholi K, Babuya J, Langoya Akuu S, Namiiro A, Nsubuga Y, Mpagi J, Musaba M, Oguttu F, Kuteesa J, Mubuuke A, Munabi I, Kiguli S, Omara T. Widespread use of ChatGPT and other Artificial Intelligence tools among medical students in Uganda: A cross-sectional study. PLOS ONE 2025;20(1):e0313776
  45. Feigerlova E, Hani H, Hothersall-Davies E. A systematic review of the impact of artificial intelligence on educational outcomes in health professions education. BMC Medical Education 2025;25(1)
  46. Nordquist J, Silva S, Caverzagie K, Hall J. Clinical learning environments: Updates. Medical Teacher 2025;47(6):911
  47. Erdat E, Kavak E. Benchmarking LLM chatbots’ oncological knowledge with the Turkish Society of Medical Oncology’s annual board examination questions. BMC Cancer 2025;25(1)
  48. Salman I, Ameer O, Khanfar M, Hsieh Y. Artificial intelligence in healthcare education: evaluating the accuracy of ChatGPT, Copilot, and Google Gemini in cardiovascular pharmacology. Frontiers in Medicine 2025;12
  49. Murthy A, Palaniappan V, Radhakrishnan S, Rajaa S, Karthikeyan K. A Comparative Analysis of the Performance of Large Language Models and Human Respondents in Dermatology. Indian Dermatology Online Journal 2025;16(2):241
  50. Bolgova O, Shypilova I, Mavrych V. Large Language Models in Biochemistry Education: Comparative Evaluation of Performance. JMIR Medical Education 2025;11:e67244
  51. Prazeres F. ChatGPT’s Performance on Portuguese Medical Examination Questions: Comparative Analysis of ChatGPT-3.5 Turbo and ChatGPT-4o Mini. JMIR Medical Education 2025;11:e65108
  52. Buhl L. The answer may vary: large language model response patterns challenge their use in test item analysis. Medical Teacher 2025:1
  53. Yitzhaki S, Peled N, Kaplan E, Kadmon G, Nahum E, Gendler Y, Weissbach A. Comparing ChatGPT‐4 and a Paediatric Intensive Care Specialist in Responding to Medical Education Questions: A Multicenter Evaluation. Journal of Paediatrics and Child Health 2025
  54. Mustață M, Iliescu D, Mavriș E, Jude C, Bojor L, Tudorache P, Cîrdei I, Hrab D, Aluculesei A, Răpan I, Dan-Șuteu Ş, Roman D, Urseiu C. ChatGPT-Assisted Decision-Making: An In-Depth Exploration of the Human–AI Interaction. International Journal of Human–Computer Interaction 2025:1
  55. Wang L, Li J, Zhuang B, Huang S, Fang M, Wang C, Li W, Zhang M, Gong S. Accuracy of Large Language Models When Answering Clinical Research Questions: Systematic Review and Network Meta-Analysis. Journal of Medical Internet Research 2025;27:e64486
  56. Li Z, Yan C, Cao Y, Gong A, Li F, Zeng R. Evaluating performance of large language models for atrial fibrillation management using different prompting strategies and languages. Scientific Reports 2025;15(1)
  57. Cheng E. Leveraging generative AI in science lesson study: transforming density concept instruction through ChatGPT integration. International Journal for Lesson & Learning Studies 2025

Conference Proceedings

  1. Dong B, Bai J, Xu T, Zhou Y. Large Language Models in Education: A Systematic Review. 2024 6th International Conference on Computer Science and Technologies in Education (CSTE)