Published in Vol 9 (2023)
Preprints (earlier versions) of this paper are available at https://preprints.jmir.org/preprint/52202.

Journals
- Noda M, Ueno T, Koshu R, Takaso Y, Shimada M, Saito C, Sugimoto H, Fushiki H, Ito M, Nomura A, Yoshizaki T. Performance of GPT-4V in Answering the Japanese Otolaryngology Board Certification Examination Questions: Evaluation Study. JMIR Medical Education 2024;10:e57054
- Gravina A, Pellegrino R, Palladino G, Imperio G, Ventura A, Federico A. Charting new AI education in gastroenterology: Cross-sectional evaluation of ChatGPT and perplexity AI in medical residency exam. Digestive and Liver Disease 2024;56(8):1304
- Wang S, Mo C, Chen Y, Dai X, Wang H, Shen X. Exploring the Performance of ChatGPT-4 in the Taiwan Audiologist Qualification Examination: Preliminary Observational Study Highlighting the Potential of AI Chatbots in Hearing Care. JMIR Medical Education 2024;10:e55595
- Gurbuz D, Varis E. Is ChatGPT knowledgeable of acute coronary syndromes and pertinent European Society of Cardiology Guidelines? Minerva Cardiology and Angiology 2024;72(3)
- Liu M, Okuhara T, Chang X, Shirabe R, Nishiie Y, Okada H, Kiuchi T. Performance of ChatGPT Across Different Versions in Medical Licensing Examinations Worldwide: Systematic Review and Meta-Analysis. Journal of Medical Internet Research 2024;26:e60807
- Takahashi H, Shikino K, Kondo T, Komori A, Yamada Y, Saita M, Naito T. Educational Utility of Clinical Vignettes Generated in Japanese by ChatGPT-4: Mixed Methods Study. JMIR Medical Education 2024;10:e59133
- Sallam M, Al-Mahzoum K, Alshuaib O, Alhajri H, Alotaibi F, Alkhurainej D, Al-Balwah M, Barakat M, Egger J. Language discrepancies in the performance of generative artificial intelligence models: an examination of infectious disease queries in English and Arabic. BMC Infectious Diseases 2024;24(1)
- Liu M, Okuhara T, Dai Z, Huang W, Gu L, Okada H, Furukawa E, Kiuchi T. Evaluating the Effectiveness of advanced large language models in medical Knowledge: A Comparative study using Japanese national medical examination. International Journal of Medical Informatics 2025;193:105673
- Eoh K, Kwon G, Lee E, Lee J, Lee I, Kim Y, Nam E. Efficacy of large language models and their potential in Obstetrics and Gynecology education. Obstetrics & Gynecology Science 2024;67(6):550
- Ho C, Tian T, Ayers A, Aaron R, Phillips V, Wolf R, Mathioudakis N, Dai T, Klonoff D. Qualitative metrics from the biomedical literature for evaluating large language models in clinical decision-making: a narrative review. BMC Medical Informatics and Decision Making 2024;24(1)
- Huang T, Hsieh P, Chang Y. Performance Comparison of Junior Residents and ChatGPT in the Objective Structured Clinical Examination (OSCE) for Medical History Taking and Documentation of Medical Records: Development and Usability Study. JMIR Medical Education 2024;10:e59902
- Burisch C, Bellary A, Breuckmann F, Ehlers J, Thal S, Sellmann T, Gödde D. ChatGPT-4 Performance on German Continuing Medical Education—Friend or Foe (Trick or Treat)? Protocol for a Randomized Controlled Trial. JMIR Research Protocols 2025;14:e63887
- Fukushima T, Manabe M, Yada S, Wakamiya S, Yoshida A, Urakawa Y, Maeda A, Kan S, Takahashi M, Aramaki E. Evaluating and Enhancing Japanese Large Language Models for Genetic Counseling Support: Comparative Study of Domain Adaptation and the Development of an Expert-Evaluated Dataset. JMIR Medical Informatics 2025;13:e65047
- Xiao J, Li M, Cai R, Huang H, Yu H, Huang L, Li J, Yu T, Zhang J, Cheng S. Smart Pharmaceutical Monitoring System With Personalized Medication Schedules and Self-Management Programs for Patients With Diabetes: Development and Evaluation Study. Journal of Medical Internet Research 2025;27:e56737
- Gungor N, Esen F, Tasci T, Gungor K, Cil K. Navigating Gynecological Oncology with Different Versions of ChatGPT: A Transformative Breakthrough or the Next Black Box Challenge? Oncology Research and Treatment 2024;48(3):102
- Ye H, Xu J, Huang D, Xie M, Guo J, Yang J, Bao H, Zhang M, Zheng C. Assessment of large language models’ performances and hallucinations for Chinese postgraduate medical entrance examination. Discover Education 2025;4(1)
- Tseng L, Lu Y, Tseng L, Chen Y, Chen H. Performance of ChatGPT-4 on Taiwanese Traditional Chinese Medicine Licensing Examinations: Cross-Sectional Study. JMIR Medical Education 2025;11:e58897
- Matsutomo N, Fukami M, Yamamoto T. Can interactive artificial intelligence be used for patient explanations of nuclear medicine examinations in Japanese? Annals of Nuclear Medicine 2025;39(8):774
- Aydın A, Reis D. ChatGPT 3.5, ChatGPT 4.0 ve Hemşirelik Öğrencilerinin Çocuk Acillerde Hemşirelik Yaklaşımı Dersi Sınavındaki Performans Karşılaştırmaları. Bandırma Onyedi Eylül Üniversitesi Sağlık Bilimleri ve Araştırmaları Dergisi 2025;7(1):73
- Wang L, Li J, Zhuang B, Huang S, Fang M, Wang C, Li W, Zhang M, Gong S. Accuracy of Large Language Models When Answering Clinical Research Questions: Systematic Review and Network Meta-Analysis. Journal of Medical Internet Research 2025;27:e64486
- Fukushima M, Eshita S, Fukuhara H. Advancements and limitations of LLMs in replicating human color-word associations. Discover Artificial Intelligence 2025;5(1)
- Liu H, Chen S, Wang W, Lee C, Hsu H, Shen S, Chiou H, Lee W. Evaluating Large Language Models for Enhancing Radiology Specialty Examination: A Comparative Study with Human Performance. Academic Radiology 2025;32(9):4974
- Meyer B, Kfuri‐Rubens R, Schmidt G, Tariq M, Riedel C, Recker F, Riedel F, Kiechle M, Riedel M. Exploring the potential of AI‐powered applications for clinical decision‐making in gynecologic oncology. International Journal of Gynecology & Obstetrics 2025;171(2):698
- Silveira J. Comments From the Editor: Generative AI in Research. Update: Applications of Research in Music Education 2025;43(3):3
- Forster P, Käsbohrer A, Cramer H, Frass M, Maeschli A, Martin D, Panhofer P, Stetina B, Wolf U, Zentek J, Weiermayer P, Thiyagarajan K. CIMUVET-survey: Complementary and Integrative Medicine (CIM) use in veterinary practice in Austria and CIM education at universities in Austria, Germany and Switzerland. PLOS One 2025;20(7):e0327599
- Feitosa Filho H, Furtado J, Eulálio E, Ribeiro P, Paiva L, Correia M, Silva Júnior G. ChatGPT performance in answering medical residency questions in nephrology: a pilot study in Brazil. Brazilian Journal of Nephrology 2025;47(4)
- Feitosa Filho H, Furtado J, Eulálio E, Ribeiro P, Paiva L, Correia M, Silva Júnior G. Desempenho do ChatGPT na resposta a questões de residência médica em Nefrologia: um estudo piloto no Brasil. Brazilian Journal of Nephrology 2025;47(4)
- Stimmer L, Kuiper R, Polledo L, Ressel L, Rodriguez J, Veiga I, Williams J, Herder V. Natural language processing in veterinary pathology: A review. Veterinary Pathology 2025;62(6):829
- Gilardi N, Ballabio M, Ravera F, Ferrando L, Stabile M, Bellodi A, Talerico G, Cigolini B, Genova C, Carbone F, Montecucco F, Bracco C, Ballestrero A, Zoppoli G. Influence of medical educational background on the diagnostic quality of ChatGPT‐4 responses in internal medicine: A pilot study. European Journal of Clinical Investigation 2025;55(11)
- Jaleel A, Aziz U, Farid G, Zahid Bashir M, Mirza T, Khizar Abbas S, Aslam S, Sikander R. Evaluating the Potential and Accuracy of ChatGPT-3.5 and 4.0 in Medical Licensing and In-Training Examinations: Systematic Review and Meta-Analysis. JMIR Medical Education 2025;11:e68070
- Lin Y, Luo Z, Ye Z, Zhong N, Zhao L, Zhang L, Li X, Chen Z, Chen Y. Applications, Challenges, and Prospects of Generative Artificial Intelligence Empowering Medical Education: Scoping Review. JMIR Medical Education 2025;11:e71125
- Chan M, Tjio C, Chan T, Tan Y, Chua A, Loh S, Leow G, Gan M, Lim X, Choo A, Liu Y, Tan J, Teo E, Yap Q, Yonghan T, Makmur A, Kumar N, Tan J, Hallinan J. Large Language Model (LLM)-Predicted and LLM-Assisted Calculation of the Spinal Instability Neoplastic Score (SINS) Improves Clinician Accuracy and Efficiency. Cancers 2025;17(19):3198
- Shaikh Y, Jeelani-Shaikh Z, Jeelani M, Javaid A, Mahmud T, Gaglani S, Gibbons M, Cheema M, Cross A, Livingston D, Cheatham M, Nezami E, Dixon R, Niranjan-Azadi A, Zafar S, Siddiqui Z, Villanueva C. Collaborative intelligence in AI: Evaluating the performance of a council of AIs on the USMLE. PLOS Digital Health 2025;4(10):e0000787
- Sun R, Hu X, Shao Y, Luo Z, Liu B, Cheng Y. Using Large Language Models to Analyze Interviews for Driver Psychological Assessment: A Performance Comparison of ChatGPT and Google-Gemini. Symmetry 2025;17(10):1713
- Warlick A, Clifton C, Trinh T, Kaur R, Weinberg A, Collins J. Integrating a chatbot into simulation-based perfusion training: A pilot randomized controlled trial. Perfusion 2025
- Aphale P, Shekhar H, Dokania S. From Accuracy to Applicability: Rethinking Large Language Model Integration in Radiology Exam Design. Academic Radiology 2025
- Qi B, Zheng Y, Wang Y, Xu L. Comparison of ChatGPT and DeepSeek on a Standardized Audiologist Qualification Examination in Chinese: Observational Study. JMIR Formative Research 2025;9:e79534
