Trialling a Large Language Model (ChatGPT) in General Practice With the Applied Knowledge Test: Observational Study Demonstrating Opportunities and Limitations in Primary Care

doi:10.2196/46599

Published on 21.Apr.2023 in Vol 9 (2023)

This is a member publication of University of Cambridge (Jisc)

Preprints (earlier versions) of this paper are available at https://preprints.jmir.org/preprint/46599, first published 20.Feb.2023.

Arun James Thirunavukarasu¹; Refaat Hassan¹; Shathar Mahmood¹; Rohan Sanghera¹; Kara Barzangi¹; Mohanned El Mukashfi¹; Sachin Shah²

Cited by (161)

Journals

Wosny M, Strasser L, Hastings J. Experience of Health Care Professionals Using Digital Tools in the Hospital: Qualitative Systematic Review. JMIR Human Factors 2023;10:e50357
Sallam M, Salim N, Barakat M, Al-Mahzoum K, Al-Tammemi A, Malaeb D, Hallit R, Hallit S. Assessing Health Students' Attitudes and Usage of ChatGPT in Jordan: Validation Study. JMIR Medical Education 2023;9:e48254
Fraser H, Crossland D, Bacher I, Ranney M, Madsen T, Hilliard R. Comparison of Diagnostic and Triage Accuracy of Ada Health and WebMD Symptom Checkers, ChatGPT, and Physicians for Patients in an Emergency Department: Clinical Data Analysis Study. JMIR mHealth and uHealth 2023;11:e49995
Takagi S, Watari T, Erabi A, Sakaguchi K. Performance of GPT-3.5 and GPT-4 on the Japanese Medical Licensing Examination: Comparison Study. JMIR Medical Education 2023;9:e48002
Borchert R, Hickman C, Pepys J, Sadler T. Performance of ChatGPT on the Situational Judgement Test—A Professional Dilemmas–Based Examination for Doctors in the United Kingdom. JMIR Medical Education 2023;9:e48978
Miao J, Thongprayoon C, Garcia Valencia O, Krisanapan P, Sheikh M, Davis P, Mekraksakit P, Suarez M, Craici I, Cheungpasitporn W. Performance of ChatGPT on Nephrology Test Questions. Clinical Journal of the American Society of Nephrology 2024;19(1):35
Corti C, Castellano G, Curigliano G. Exploring the utility and limitations of ChatGPT in scientific literature searches. ESMO Real World Data and Digital Oncology 2023;1:100001
Thirunavukarasu A. Large language models will not replace healthcare professionals: curbing popular fears and hype. Journal of the Royal Society of Medicine 2023;116(5):181
Abd-alrazaq A, AlSaad R, Alhuwail D, Ahmed A, Healy P, Latifi S, Aziz S, Damseh R, Alabed Alrazak S, Sheikh J. Large Language Models in Medical Education: Opportunities, Challenges, and Future Directions. JMIR Medical Education 2023;9:e48291
Thibaut G, Dabbagh A, Liverneaux P. Does Google’s Bard Chatbot perform better than ChatGPT on the European hand surgery exam?. International Orthopaedics 2024;48(1):151
Morita P, Abhari S, Kaur J, Lotto M, Miranda P, Oetomo A. Applying ChatGPT in public health: a SWOT and PESTLE analysis. Frontiers in Public Health 2023;11
Tan T, Thirunavukarasu A, Jin L, Lim J, Poh S, Teo Z, Ang M, Chan R, Ong J, Turner A, Karlström J, Wong T, Stern J, Ting D. Artificial intelligence and digital health in global eye health: opportunities and challenges. The Lancet Global Health 2023;11(9):e1432
Flores-Cohaila J, García-Vicente A, Vizcarra-Jiménez S, De la Cruz-Galán J, Gutiérrez-Arratia J, Quiroga Torres B, Taype-Rondan A. Performance of ChatGPT on the Peruvian National Licensing Medical Examination: Cross-Sectional Study. JMIR Medical Education 2023;9:e48039
Chakraborty C, Pal S, Bhattacharya M, Dash S, Lee S. Overview of Chatbots with special emphasis on artificial intelligence-enabled ChatGPT in medical science. Frontiers in Artificial Intelligence 2023;6
Krusche M, Callhoff J, Knitza J, Ruffer N. Diagnostic accuracy of a large language model in rheumatology: comparison of physician and ChatGPT-4. Rheumatology International 2023;44(2):303
Tan T, Thirunavukarasu A, Campbell J, Keane P, Pasquale L, Abramoff M, Kalpathy-Cramer J, Lum F, Kim J, Baxter S, Ting D. Generative Artificial Intelligence Through ChatGPT and Other Large Language Models in Ophthalmology. Ophthalmology Science 2023;3(4):100394
Traoré S, Goetsch T, Muller B, Dabbagh A, Liverneaux P. Is ChatGPT able to pass the first part of the European Board of Hand Surgery diploma examination?. Hand Surgery and Rehabilitation 2023;42(4):362
Abi-Rafeh J, Xu H, Kazan R, Tevlin R, Furnas H. Large Language Models and Artificial Intelligence: A Primer for Plastic Surgeons on the Demonstrated and Potential Applications, Promises, and Limitations of ChatGPT. Aesthetic Surgery Journal 2024;44(3):329
Eguia H, Sanz García J. Inteligencia artificial, ChatGPT y atención primaria. Medicina de Familia. SEMERGEN 2023;49(7):102069
Thirunavukarasu A, Ting D, Elangovan K, Gutierrez L, Tan T, Ting D. Large language models in medicine. Nature Medicine 2023;29(8):1930
Yang R, Tan T, Lu W, Thirunavukarasu A, Ting D, Liu N. Large language models in health care: Development, applications, and challenges. Health Care Science 2023;2(4):255
Suárez A, Díaz‐Flores García V, Algar J, Gómez Sánchez M, Llorente de Pedro M, Freire Y. Unveiling the ChatGPT phenomenon: Evaluating the consistency and accuracy of endodontic question answers. International Endodontic Journal 2024;57(1):108
Ting D, Tan T, Ting D. ChatGPT in ophthalmology: the dawn of a new era?. Eye 2024;38(1):4
Ng F, Thirunavukarasu A, Cheng H, Tan T, Gutierrez L, Lan Y, Ong J, Chong Y, Ngiam K, Ho D, Wong T, Kwek K, Doshi-Velez F, Lucey C, Coffman T, Ting D. Artificial intelligence education: An evidence-based medicine approach for consumers, translators, and developers. Cell Reports Medicine 2023;4(10):101230
Miao J, Thongprayoon C, Suppadungsuk S, Garcia Valencia O, Qureshi F, Cheungpasitporn W. Innovating Personalized Nephrology Care: Exploring the Potential Utilization of ChatGPT. Journal of Personalized Medicine 2023;13(12):1681
Sahin M, Sozer A, Kuzucu P, Turkmen T, Sahin M, Sozer E, Tufek O, Nernekli K, Emmez H, Celtikci E. Beyond human in neurosurgical exams: ChatGPT's success in the Turkish neurosurgical society proficiency board exams. Computers in Biology and Medicine 2024;169:107807
Suárez A, Jiménez J, Llorente de Pedro M, Andreu-Vázquez C, Díaz-Flores García V, Gómez Sánchez M, Freire Y. Beyond the Scalpel: Assessing ChatGPT's potential as an auxiliary intelligent virtual assistant in oral surgery. Computational and Structural Biotechnology Journal 2024;24:46
Watari T, Takagi S, Sakaguchi K, Nishizaki Y, Shimizu T, Yamamoto Y, Tokuda Y. Performance Comparison of ChatGPT-4 and Japanese Medical Residents in the General Medicine In-Training Examination: Comparison Study. JMIR Medical Education 2023;9:e52202
Thirunavukarasu A. How Can the Clinical Aptitude of AI Assistants Be Assayed?. Journal of Medical Internet Research 2023;25:e51603
Civettini I, Zappaterra A, Granelli B, Rindone G, Aroldi A, Bonfanti S, Colombo F, Fedele M, Grillo G, Parma M, Perfetti P, Terruzzi E, Gambacorti‐Passerini C, Ramazzotti D, Cavalca F. Evaluating the performance of large language models in haematopoietic stem cell transplantation decision‐making. British Journal of Haematology 2024;204(4):1523
Tangadulrat P, Sono S, Tangtrakulwanich B. Using ChatGPT for Clinical Practice and Medical Education: Cross-Sectional Survey of Medical Students’ and Physicians’ Perceptions. JMIR Medical Education 2023;9:e50658
Kollitsch L, Eredics K, Marszalek M, Rauchenwald M, Brookman-May S, Burger M, Körner-Riffard K, May M. How does artificial intelligence master urological board examinations? A comparative analysis of different Large Language Models’ accuracy and reliability in the 2022 In-Service Assessment of the European Board of Urology. World Journal of Urology 2024;42(1)
Al-Sharif E, Penteado R, Dib El Jalbout N, Topilow N, Shoji M, Kikkawa D, Liu C, Korn B. Evaluating the Accuracy of ChatGPT and Google BARD in Fielding Oculoplastic Patient Queries: A Comparative Study on Artificial versus Human Intelligence. Ophthalmic Plastic & Reconstructive Surgery 2024;40(3):303
Sezgin E. Redefining Virtual Assistants in Health Care: The Future With Large Language Models. Journal of Medical Internet Research 2024;26:e53225
Sallam M, Barakat M, Sallam M. A Preliminary Checklist (METRICS) to Standardize the Design and Reporting of Studies on Generative Artificial Intelligence–Based Models in Health Care Education and Practice: Development Study Involving a Literature Review. Interactive Journal of Medical Research 2024;13:e54704
Chang Y, Wang X, Wang J, Wu Y, Yang L, Zhu K, Chen H, Yi X, Wang C, Wang Y, Ye W, Zhang Y, Chang Y, Yu P, Yang Q, Xie X. A Survey on Evaluation of Large Language Models. ACM Transactions on Intelligent Systems and Technology 2024;15(3):1
Hatia A, Doldo T, Parrini S, Chisci E, Cipriani L, Montagna L, Lagana G, Guenza G, Agosta E, Vinjolli F, Hoxha M, D’Amelio C, Favaretto N, Chisci G. Accuracy and Completeness of ChatGPT-Generated Information on Interceptive Orthodontics: A Multicenter Collaborative Study. Journal of Clinical Medicine 2024;13(3):735
Pereyra L, Schlottmann F, Steinberg L, Lasa J. Colorectal Cancer Prevention. Journal of Clinical Gastroenterology 2024;58(10):1022
Meyer A, Riese J, Streichert T. Comparison of the Performance of GPT-3.5 and GPT-4 With That of Medical Students on the Written German Medical Licensing Examination: Observational Study. JMIR Medical Education 2024;10:e50965
Su M, Lin L, Lin L, Chen Y. Assessing question characteristic influences on ChatGPT's performance and response-explanation consistency: Insights from Taiwan's Nursing Licensing Exam. International Journal of Nursing Studies 2024;153:104717
Mai D, Da C, Hanh N. The use of ChatGPT in teaching and learning: a systematic review through SWOT analysis approach. Frontiers in Education 2024;9
Wei Q, Yao Z, Cui Y, Wei B, Jin Z, Xu X. Evaluation of ChatGPT-generated medical responses: A systematic review and meta-analysis. Journal of Biomedical Informatics 2024;151:104620
Maitland A, Fowkes R, Maitland S. Can ChatGPT pass the MRCP (UK) written examinations? Analysis of performance and errors using a clinical decision-reasoning framework. BMJ Open 2024;14(3):e080558
Le M, Davis M. ChatGPT Yields a Passing Score on a Pediatric Board Preparatory Exam but Raises Red Flags. Global Pediatric Health 2024;11
Peled T, Sela H, Weiss A, Grisaru‐Granovsky S, Agrawal S, Rottenstreich M. Evaluating the validity of ChatGPT responses on common obstetric issues: Potential clinical applications and implications. International Journal of Gynecology & Obstetrics 2024;166(3):1127
Zhang Y, Xu L, Ji H. Author's reply: AI in medicine, bridging the chasm between potential and capability. Digestive and Liver Disease 2024;56(6):1116
Laymouna M, Ma Y, Lessard D, Schuster T, Engler K, Lebouché B. Roles, Users, Benefits, and Limitations of Chatbots in Health Care: Rapid Review. Journal of Medical Internet Research 2024;26:e56930
Katz U, Cohen E, Shachar E, Somer J, Fink A, Morse E, Shreiber B, Wolf I. GPT versus Resident Physicians — A Benchmark Based on Official Board Scores. NEJM AI 2024;1(5)
Zhuo K, Kim P, Kovacic J, Chalasani V, Rasiah K, Menogue S, Chung A. Can Artificial Intelligence Treat My Urinary Tract Infections?—Evaluation of Health Information Provided by OpenAI™ ChatGPT on Urinary Tract Infections. Société Internationale d’Urologie Journal 2024;5(2):104
Tripathi S, Sukumaran R, Dheer S, Cook T. Promptwise: Prompt Engineering Paradigm for Enhanced Patient-Large Language Model Interactions Towards Medical Education. SSRN Electronic Journal 2024
Nassiri K, Akhloufi M. Recent Advances in Large Language Models for Healthcare. BioMedInformatics 2024;4(2):1097
Thirunavukarasu A, Mahmood S, Malem A, Foster W, Sanghera R, Hassan R, Zhou S, Wong S, Wong Y, Chong Y, Shakeel A, Chang Y, Tan B, Jain N, Tan T, Rauz S, Ting D, Ting D, Luo M. Large language models approach expert-level clinical knowledge and reasoning in ophthalmology: A head-to-head cross-sectional study. PLOS Digital Health 2024;3(4):e0000341
Tessler I, Wolfovitz A, Alon E, Gecel N, Livneh N, Zimlichman E, Klang E. ChatGPT’s adherence to otolaryngology clinical practice guidelines. European Archives of Oto-Rhino-Laryngology 2024;281(7):3829
Varghese C, Harrison E, O’Grady G, Topol E. Artificial intelligence in surgery. Nature Medicine 2024;30(5):1257
Cong-Lem N, Soyoof A, Tsering D. A Systematic Review of the Limitations and Associated Opportunities of ChatGPT. International Journal of Human–Computer Interaction 2024:1
Scott I, Zuccon G. The new paradigm in machine learning – foundation models, large language models and beyond: a primer for physicians. Internal Medicine Journal 2024;54(5):705
Bonnechère B. Unlocking the Black Box? A Comprehensive Exploration of Large Language Models in Rehabilitation. American Journal of Physical Medicine & Rehabilitation 2024;103(6):532
Ozden I, Gokyar M, Ozden M, Sazak Ovecoglu H. Assessment of artificial intelligence applications in responding to dental trauma. Dental Traumatology 2024;40(6):722
Duggan R, Tsuruda K. ChatGPT performance on radiation technologist and therapist entry to practice exams. Journal of Medical Imaging and Radiation Sciences 2024;55(4):101426
Mousavi M, Shafiee S, Harley J, Cheung J, Abbasgholizadeh Rahimi S. Performance of generative pre-trained transformers (GPTs) in Certification Examination of the College of Family Physicians of Canada. Family Medicine and Community Health 2024;12(Suppl 1):e002626
Şan H, Bayrakcı Ö, Çağdaş B, Serdengeçti M, Alagöz E. Reliability and readability analysis of ChatGPT-4 and Google Bard as a patient information source for the most commonly applied radionuclide treatments in cancer patients. Revista Española de Medicina Nuclear e Imagen Molecular (English Edition) 2024;43(4):500021
Suwała S, Szulc P, Guzowski C, Kamińska B, Dorobiała J, Wojciechowska K, Berska M, Kubicka O, Kosturkiewicz O, Kosztulska B, Rajewska A, Junik R. ChatGPT-3.5 passes Poland’s medical final examination—Is it possible for ChatGPT to become a doctor in Poland?. SAGE Open Medicine 2024;12
Ong J, Chang S, William W, Butte A, Shah N, Chew L, Liu N, Doshi-Velez F, Lu W, Savulescu J, Ting D. Medical Ethics of Large Language Models in Medicine. NEJM AI 2024;1(7)
Hager P, Jungmann F, Holland R, Bhagat K, Hubrecht I, Knauer M, Vielhauer J, Makowski M, Braren R, Kaissis G, Rueckert D. Evaluation and mitigation of the limitations of large language models in clinical decision-making. Nature Medicine 2024;30(9):2613
Lucas M, Yang J, Pomeroy J, Yang C. Reasoning with large language models for medical question answering. Journal of the American Medical Informatics Association 2024;31(9):1964
Şan H, Bayrakçi Ö, Çağdaş B, Serdengeçti M, Alagöz E. Análisis de confiabilidad y lectibilidad de ChatGPT-4 y Google Bard como fuente de información del paciente para los tratamientos con radionúclidos más comúnmente aplicados en pacientes con cáncer. Revista Española de Medicina Nuclear e Imagen Molecular 2024;43(4):500021
Chow J, Cheng T, Chien T, Chou W. Assessing ChatGPT’s Capability for Multiple Choice Questions Using RaschOnline: Observational Study. JMIR Formative Research 2024;8:e46800
Moglia A, Georgiou K, Cerveri P, Mainardi L, Satava R, Cuschieri A. Large language models in healthcare: from a systematic review on medical examinations to a comparative analysis on fundamentals of robotic surgery online test. Artificial Intelligence Review 2024;57(9)
Fatima A, Shafique M, Alam K, Fadlalla Ahmed T, Mustafa M. ChatGPT in medicine: A cross-disciplinary systematic review of ChatGPT’s (artificial intelligence) role in research, clinical practice, education, and patient interaction. Medicine 2024;103(32):e39250
Casey J, Dworkin M, Winschel J, Molino J, Daher M, Katarincic J, Gil J, Akelman E. ChatGPT: A concise Google alternative for people seeking accurate and comprehensive carpal tunnel syndrome information. Hand Surgery and Rehabilitation 2024;43(5):101757
Heinke A, Radgoudarzi N, Huang B, Baxter S. A review of ophthalmology education in the era of generative artificial intelligence. Asia-Pacific Journal of Ophthalmology 2024;13(4):100089
Goodings A, Kajitani S, Chhor A, Albakri A, Pastrak M, Kodancha M, Ives R, Lee Y, Kajitani K. Assessment of ChatGPT-4 in Family Medicine Board Examinations Using Advanced AI Learning and Analytical Methods: Observational Study. JMIR Medical Education 2024;10:e56128
Cherrez-Ojeda I, Gallardo-Bastidas J, Robles-Velasco K, Osorio M, Velez Leon E, Leon Velastegui M, Pauletto P, Aguilar-Díaz F, Squassi A, González Eras S, Cordero Carrasco E, Chavez Gonzalez K, Calderon J, Bousquet J, Bedbrook A, Faytong-Haro M. Understanding Health Care Students’ Perceptions, Beliefs, and Attitudes Toward AI-Powered Language Models: Cross-Sectional Study. JMIR Medical Education 2024;10:e51757
Pan G, Ni J. A cross sectional investigation of ChatGPT-like large language models application among medical students in China. BMC Medical Education 2024;24(1)
Rodríguez Weber F, Portela Ortiz J, Enríquez Barajas A. La inteligencia artificial (IA) en la medicina y su aprendizaje. Acta Médica Grupo Ángeles 2024;22(3):261
Wang L, Wan Z, Ni C, Song Q, Li Y, Clayton E, Malin B, Yin Z. Applications and Concerns of ChatGPT and Other Conversational Large Language Models in Health Care: Systematic Review. Journal of Medical Internet Research 2024;26:e22769
Ros-Arlanzón P, Perez-Sempere A. Evaluating AI Competence in Specialized Medicine: Comparative Analysis of ChatGPT and Neurologists in a Neurology Specialist Examination in Spain. JMIR Medical Education 2024;10:e56762
Ronquillo J, South B, Wiemken T, Jadhav A, Watt S, De Jesus M, Habtezion A. AI for Oncology Drug Data Harmonization — Amazon versus OpenAI. NEJM AI 2024;1(11)
Attanasio M, Mazza M, Le Donne I, Masedu F, Greco M, Valenti M. Does ChatGPT have a typical or atypical theory of mind?. Frontiers in Psychology 2024;15
Khabaz K, Newman‐Hung N, Kallini J, Kendal J, Christ A, Bernthal N, Wessel L. Assessment of Artificial Intelligence Chatbot Responses to Common Patient Questions on Bone Sarcoma. Journal of Surgical Oncology 2025;131(4):719
Holt N, Byrne M. The Role of Artificial Intelligence and Big Data for Gastrointestinal Disease. Gastrointestinal Endoscopy Clinics of North America 2025;35(2):291
Liu F, Chang X, Zhu Q, Huang Y, Li Y, Wang H. Assessing clinical medicine students’ acceptance of large language model: based on technology acceptance model. BMC Medical Education 2024;24(1)
Masison J, Lehmann H, Wan J. Utilization of Computable Phenotypes in Electronic Health Record Research: A Review and Case Study in Atopic Dermatitis. Journal of Investigative Dermatology 2025;145(5):1008
Hassan M, Ayad M, Nembhard C, Hayes-Dixon A, Lin A, Janjua M, Franko J, Tee M. Artificial Intelligence Compared to Manual Selection of Prospective Surgical Residents. Journal of Surgical Education 2025;82(1):103308
Kovari A. AI for Decision Support: Balancing Accuracy, Transparency, and Trust Across Sectors. Information 2024;15(11):725
Ho C, Tian T, Ayers A, Aaron R, Phillips V, Wolf R, Mathioudakis N, Dai T, Klonoff D. Qualitative metrics from the biomedical literature for evaluating large language models in clinical decision-making: a narrative review. BMC Medical Informatics and Decision Making 2024;24(1)
Lee J, Park S, Shin J, Cho B. Analyzing evaluation methods for large language models in the medical field: a scoping review. BMC Medical Informatics and Decision Making 2024;24(1)
Abhari S, Afshari Y, Fatehi F, Salmani H, Garavand A, Chumachenko D, Zakerabasali S, Morita P. Exploring ChatGPT in clinical inquiry: a scoping review of characteristics, applications, challenges, and evaluation. Annals of Medicine & Surgery 2024;86(12):7094
Hasan S, Fury M, Woo J, Kunze K, Ramkumar P. Ethical Application of Generative Artificial Intelligence in Medicine. Arthroscopy 2025;41(4):874
Pavone M, Palmieri L, Bizzarri N, Rosati A, Campolo F, Innocenzi C, Taliento C, Restaino S, Catena U, Vizzielli G, Akladios C, Ianieri M, Marescaux J, Campo R, Fanfani F, Scambia G. Artificial Intelligence, the ChatGPT Large Language Model: Assessing the Accuracy of Responses to the Gynaecological Endoscopic Surgical Education and Assessment (GESEA) Level 1-2 knowledge tests. Facts, Views and Vision in ObGyn 2024;16(4):449
Bordin R, Bartnack C, Westphalen V, Gasparello G, Bark M, Gava T, Tanaka O. Evaluating generative pretraining transformer reliability in addressing dental trauma: A cross-sectional observational study on avulsion and intrusion. Saudi Endodontic Journal 2025;15(1):45
Arvidsson R, Gunnarsson R, Entezarjou A, Sundemo D, Wikberg C. ChatGPT (GPT-4) versus doctors on complex cases of the Swedish family medicine specialist examination: an observational comparative study. BMJ Open 2024;14(12):e086148
Yang Z, Yao Z, Tasmin M, Vashisht P, Jang W, Ouyang F, Wang B, McManus D, Berlowitz D, Yu H. Unveiling GPT-4V's hidden challenges behind high accuracy on USMLE questions: Observational Study. Journal of Medical Internet Research 2025;27:e65146
Knee C, Campbell R, Sivakumar B, Wines A, Symes M. Investigating the Accuracy and Consistency of ChatGPT in the Management of Achilles Tendon Ruptures. Cureus 2025
Ahmed M, Lam J, Chow A, Chow C. A Primer on Large Language Models (LLMs) and ChatGPT for Cardiovascular Healthcare Professionals. CJC Open 2025;7(5):660
Liu Q, Hu A, Gladman T, Gallagher S. Eight Months into Reality: A Scoping Review of the Application of ChatGPT in Higher Education Teaching and Learning. Innovative Higher Education 2025;50(5):1677
Li R, Wu T. Application of Artificial Intelligence Generated Content in Medical Examinations. Advances in Medical Education and Practice 2025;Volume 16:331
Fernández-Pichel M, Pichel J, Losada D. Evaluating search engines and large language models for answering health questions. npj Digital Medicine 2025;8(1)
Grosser J, Düvel J, Hasemann L, Schneider E, Greiner W. Studying the Potential Effects of Artificial Intelligence on Physician Autonomy: Scoping Review. JMIR AI 2025;4:e59295
Li J, Yang Y, Chen R, Zheng D, Pang P, Lam C, Wong D, Wang Y, Wu J. Identifying healthcare needs with patient experience reviews using ChatGPT. PLOS ONE 2025;20(3):e0313442
de Almeida T, de Oliveira N, He C, Rocha C, Teixeira M, Rogers P, Kocsis K. Using Generative Pre-Trained Transformer-4 (GPT-4), ffmpeg, and Microsoft Azure to Aid in Creating a Text-to-Video Generation Tool to Improve Safety Shares and Incident Descriptions in the Mining Industry. Mining, Metallurgy & Exploration 2025;42(3):1325
Chen X, Xiang J, Lu S, Liu Y, He M, Shi D. Evaluating large language models and agents in healthcare: key challenges in clinical applications. Intelligent Medicine 2025;5(2):151
Mavrych V, Yaqinuddin A, Bolgova O. Claude, ChatGPT, Copilot, and Gemini performance versus students in different topics of neuroscience. Advances in Physiology Education 2025;49(2):430
Qin H, Tong Y. Opportunities and Challenges for Large Language Models in Primary Health Care. Journal of Primary Care & Community Health 2025;16
Özdemir Ö, Güven Y. ChatGPT’nin Diş Hekimliğinde Kullanım Alanları ve Sınırlamaları. Selcuk Dental Journal 2025;12(1):184
Hasan S, Ipaktchi K, Meyer N, Liverneaux P. Comparison of hand surgery certification exams in Europe and the United States using ChatGPT 4.0. Journal of Hand and Microsurgery 2025;17(4):100258
Choudhury A, Shahsavar Y, Shamszare H. User Intent to Use DeepSeek for Health Care Purposes and Their Trust in the Large Language Model: Multinational Survey Study. JMIR Human Factors 2025;12:e72867
Ejas F, Khan S, Mujahid A, AlJoker F, Mautong H, Alvarado-Villa G, Kashyap A, Yasir M, Nigatu K, Jain N, Iyer N, Sandhu A, Sharafat S, Yahya S, Ghaly M, Ibrar I, Singh A, Grewal H, Huespe I, Mehta P, Arshad Z, Kashyap R, Nawaz F. Medical Students’ Perceptions of Large Language Models in Healthcare: A Multinational Cross-Sectional Study. Journal of Medical Education and Curricular Development 2025;12
Mert S, Muir L, Fuchs B, Lucksch V, Vollbach F, Haas-Lützenberger E, Giunta R, Thierfelder N, Demmer W. Can artificial intelligence pass the written European Board of Hand Surgery exam?. Hand Surgery and Rehabilitation 2025;44(4):102197
Yousefi F, Dehnavieh R, Laberge M, Gagnon M, Ghaemi M, Nadali M, Azizi N. Opportunities, challenges, and requirements for Artificial Intelligence (AI) implementation in Primary Health Care (PHC): a systematic review. BMC Primary Care 2025;26(1)
See Y, Lim K, Au W, Chia S, Fan X, Li Z. The Use of Large Language Models in Ophthalmology: A Scoping Review on Current Use-Cases and Considerations for Future Works in This Field. Big Data and Cognitive Computing 2025;9(6):151
Vural Camalan B, Doluoglu S, Taraf N, Gunay M, Ozlugedik S. ChatGPT versus DeepSeek in head and neck cancer staging and treatment planning: guideline-based study. European Archives of Oto-Rhino-Laryngology 2025;282(9):4815
Shyamsukha B, Bagde H, Sharan A, Choudhary M, Duble A, Dhan A. Evaluating the Potential of ChatGPT as a Supplementary Intelligent Virtual Assistant in Periodontology. Journal of Pharmacy and Bioallied Sciences 2025;17(Suppl 2):S1415
Chen Y, Hu G, Xia C, Yu L, Popescu M, Xu D. Large Language Models and Prompt-Based Learning: Frontiers and Challenges in Cross-Disciplinary Applications. International Journal of Artificial Intelligence and Robotics Research 2025;02
Sun K, Wang Y, Qu R, Yang Q, Luo R, Jiang Z, Wang H, Fu W. Comprehensive application of artificial intelligence in colorectal cancer: A review. iScience 2025;28(7):112980
Bentegeac R, Le Guellec B, Kuchcinski G, Amouyel P, Hamroun A. Token Probabilities to Mitigate Large Language Models Overconfidence in Answering Medical Questions: Quantitative Study. Journal of Medical Internet Research 2025;27:e64348
Liu Z, Zou W, Yang S. What Drives Patient Preferences for AI Chatbots Over Doctors? A Survey Study Using the O-S-O-R Model. Health Communication 2026;41(5):899
Kumar A, Gupta P, Pandey A. Is ChatGPT a Reliable Auxiliary Tool in Basic Life Support Training and Education? A Cross-sectional Study. Indian Journal of Critical Care Medicine 2025;29(8):684
Guo S, Song Y, Chen G, Han H, Wu H, Ma J. Promoting trust and intention to adopt health information generated by ChatGPT among healthcare customers: An empirical study. DIGITAL HEALTH 2025;11
Zheng H, Dong H, Zhao H. Trends and advances in ChatGPT applications in ophthalmology. Journal Français d'Ophtalmologie 2025;48(8):104622
Sanker V, Nordin E, Heesen P, Elfadali M, Anwar M, Chintapalli R, Cavagnaro M, Zygourakis C, Ratliff J, Desai A. Current trends and future prospects of language models and processing systems in spine surgery – a scoping review. Neurosurgical Review 2025;48(1)
Ghasemi S, Amiri P, Galavi Z. Advantages and Limitations of ChatGPT in Healthcare: A Scoping Review. Health Science Reports 2025;8(9)
Morales Morillo M, Iturralde Fernández N, Pellicer Castillo L, Suarez A, Freire Y, Diaz-Flores García V. Performance of ChatGPT-4 as an Auxiliary Tool: Evaluation of Accuracy and Repeatability on Orthodontic Radiology Questions. Bioengineering 2025;12(10):1031
Hu M, Zou P, Li T, Wang Y, Liu B. Evaluating the quality of ChatGPT-generated medical information on major ophthalmic conditions: A comparative assessment against the EQIP tool and guidelines. PLOS One 2025;20(10):e0334250
Balcı M, Akdemir C, Yıldırım F. Is Artificial Intelligence-Assisted Pregnancy Counseling Feasible? An Evaluation of the Quality of ChatGPT Responses. Cukurova Anestezi ve Cerrahi Bilimler Dergisi 2025;8(3):336
Tao L, Liu J, Lu X, Zhao Y, Zhang Y, Zhu Z, Li T, Zhang Z, Zhang Y, Yan W, Liu M, Liang W. Performance of the large language model in general medicine. Global Transitions 2026;8(1):101
Herzog C, Branford J. Relational Ethics and Structural Epistemic Injustice of AI in Medicine. Philosophy & Technology 2025;38(4)
Hong H, Shah N, Pfeffer M, Lehmann L. Physician Perspectives on Large Language Models in Health Care: A Cross-Sectional Survey Study. Applied Clinical Informatics 2025;16(05):1738
Chen Z, Liu R, Huang S, Guo Y, Ren Y. A Survey of Large-Scale Deep Learning Models in Medicine and Healthcare. Computer Modeling in Engineering & Sciences 2025;144(1):37
Pavone M, Innocenzi C, Macellari N, Cantarini C, Criscione M, Rosati A, Lecointre L, Carcagnì A, Costantini B, Marescaux J, Fagotti A, Fanfani F, Cibula D, Querleu D, Bizzarri N. Assessing the Accuracy of Large Language Models on European Guidelines for Cervical Cancer: An In Silico Benchmarking Study. BJOG: An International Journal of Obstetrics & Gynaecology 2026;133(4):771
Zhou J, Zhang W, Liu S. The evolving landscape of artificial intelligence in patient education: A bibliometric knowledge mapping study. DIGITAL HEALTH 2026;12
沈泽. Large Language Models Reshaping the Future of Surgery—Advances in the Application of Large Language Models in the Field of Surgery. Advances in Clinical Medicine 2026;16(01):588
Elmas A, Akçam M. Evaluating the Quality of Responses by ChatGPT-3.5, Google Gemini, and Microsoft Copilot to Common Pediatric Questions: A Content-Based Assessment. Güncel Pediatri 2025;23(3)
Liao W, Li M, Ma C, Han Y, Wang D, Liu H, Wang Y, Feng Z, Wang H, Guan Y. Developing a Quality Evaluation Index System for Health Conversational Artificial Intelligence: Mixed Methods Study. Journal of Medical Internet Research 2026;28:e83188
Xu G, Jiang H, Nguyen S, Wu D, Liu Z, Sha F, Li Y. Obstet-LLM: Large Language Model for Early Prediction of SGA-LGA Newborns. Journal of Computer Science and Technology 2026;41(2):638
Cheng F, Zouhar V, Chan R, Fürst D, Strobelt H, El-Assady M. Understanding Large Language Model Behaviors Through Interactive Counterfactual Generation and Analysis. IEEE Transactions on Visualization and Computer Graphics 2026;32(1):846
Özdal Zincir Ö, Çifçi Özkan E, Hatipoğlu Ş. Assessing the Quality of AI-Generated Responses to Botulinum Toxin Applications in Bruxism Therapy. Journal of Craniofacial Surgery 2026;37(3/4):658
Shepherd J, Talavia T, Raval P, Phadnis J, Singh H. Exploring the knowledge base of ChatGPT in lateral elbow tendinopathy. Shoulder & Elbow 2026
Wutz M, Söling S, Köberlein-Neu J. Physicians’ expectations of the use of conversational agents in healthcare: a qualitative study. BMC Health Services Research 2026;26(1)
Muthuvairam Subbulakshmi M, Jayaraman M, Jeyaraman J. Advancing digital mental healthcare: the role of artificial intelligence and natural language processing in powering medical chatbots for future healthcare interventions. Expert Review of Medical Devices 2026;23(6):597
Yin C, Sun X, Dai A, Ye X, Lu Y, Wang W, Chen Y, Jiang H, Yu J, Chong S, Jiang M, Xu J, Yang B, Chappa R, Chokkakula S, He K. ChatGPT in precision medicine. APL Bioengineering 2026;10(1)
Luo W, Feng T, Zhang T, Chen X, Lu X, Li Y, Hou C, Gao J. Attitudes and perceptions of the application of large language models among health professionals: A mixed-methods systematic review. Public Health 2026;254:106252
Kayira A, Elyazori H, Lybarger K, Walter F, Chelala C, Funston G. Natural Language Processing of Clinical Notes for Cancer Research and Patient Care Prior to Widespread Adoption of Generative AI: Scoping Review. JMIR AI 2026;5:e73481
Hall A, Patel A, Shariati K, Perrotta A, Argame A, Tseng C, Tseng C, Hidalgo M, Lee J. Generative AI in Surgical Care: Evaluating Large Language Model Performance in Patient Education. Journal of Artificial Intelligence for Medical Sciences 2025;6(1-4):54
Schönberg N, Deschler D, Hauer J, Zeumer M. A Comparative Evaluation of Large Language Models on Pediatric Board-Style Examinations. Hospital Pediatrics 2026;16(6):e417
Super I, Bawa H, Asan O. Exploring the Role of Large Language Models in Primary Care: Qualitative Study of Physicians in the United States and the Netherlands. JMIR Medical Informatics 2026;14:e91652
Kuşçu A, Çınarer G, Kuşçu S. Evaluation of the performance of artificial intelligence platforms in answering and generating new questions in prosthetic dentistry specialization. Anatolian Current Medical Journal 2026;8(3):409
Luo X, Liu Y, Ma Y, Li C, Yao M, Xiang H, Qin X, Liu J, Zhan X, Zhang C, Zang S, Deng K, Li L, Sun X. A Large Language Model Pipeline for Stroke Staging Using Electronical Medical Records. Journal of Evidence-Based Medicine 2026;19(2)
Heyat M, Rehman A, Zeeshan H, Hayat M, Akhtar F, Sadaf , Ansari M, Wang L, Lai D, Prasath V, Gandhi T, Sawan M. Large Language Model: Future of Healthcare Research With Challenges. WIREs Data Mining and Knowledge Discovery 2026;16(3) View
Bayırlı A, Uytun M, Erdem R, Genç Y. Comparative Evaluation of Chatgpt, Gemini and Grok with and without Deep Research Mode in Answering Bone Augmentation Queries. Mugla Journal of Science and Technology 2026;12(1):85 View
Wang X. Large language models for evidence-based planning: Evaluating an SLR-RAG framework for knowledge synthesis of urban vacant land. Transactions in Urban Data, Science, and Technology 2026 View

Books/Policy Documents

Fernández-Pichel M, Losada D, Pichel J. Computational Science – ICCS 2024.
Sanmukh S, Krzykawska-Serda M, Dragan P, Baron S, Lobaccaro J, Latek D.
Di Ieva A, Stewart C, Suero Molina E. Computational Neurosurgery.
Zhu M, Nguyen P. Natural Language Processing and Information Systems.
Fosso-Wamba S, Guthrie C. ICT: Applications and Social Interfaces.
Kaur U, Gupta N, Kaur H. Precision Healthcare: Patient Care, Decision Tools, Wearables, Legal and Ethical Issues.

Conference Proceedings

Baldassarre M, Caivano D, Fernandez Nieto B, Gigante D, Ragone A. Proceedings of the 2023 ACM Conference on Information Technology for Social Good. The Social Impact of Generative AI: An Analysis on ChatGPT
Jin Y, Chandra M, Verma G, Hu Y, De Choudhury M, Kumar S. Proceedings of the ACM Web Conference 2024. Better to Ask in English: Cross-Lingual Evaluation of Large Language Models for Healthcare Queries
Becker L, Pracht P, Sertdal P, Uboreck J, Bendel A, Martin R. 2024 IEEE Spoken Language Technology Workshop (SLT). Conditional Label Smoothing For LLM-Based Data Augmentation in Medical Text Classification
Alif H, Mashrafi M, Islam S, Fahim A, Nath A, Ahmed J. 2025 5th Asian Conference on Innovation in Technology (ASIANCON). Evaluating LLMs in Higher Education: A SEM Based Comparison of ChatGPT and DeepSeek Using the IS Success Model

Citation

Please cite as:

Thirunavukarasu AJ, Hassan R, Mahmood S, Sanghera R, Barzangi K, El Mukashfi M, Shah S
Trialling a Large Language Model (ChatGPT) in General Practice With the Applied Knowledge Test: Observational Study Demonstrating Opportunities and Limitations in Primary Care
JMIR Med Educ 2023;9:e46599
doi: 10.2196/46599 PMID: 37083633 PMCID: 10163403

This paper is in the following e-collection/theme issue:

Artificial Intelligence (AI) in Medical Education (682) New Resources for Medical Education (199) Chatbots and Conversational Agents (1142) Theme Issue: ChatGPT and Generative Language Models in Medical Education (144) Generative Language Models Including ChatGPT (1442)
