Published in Vol 9 (2023)

This is a member publication of University of Cambridge (Jisc)

Preprints (earlier versions) of this paper are available at https://preprints.jmir.org/preprint/46599.
Trialling a Large Language Model (ChatGPT) in General Practice With the Applied Knowledge Test: Observational Study Demonstrating Opportunities and Limitations in Primary Care

Journals

  1. Wosny M, Strasser L, Hastings J. Experience of Health Care Professionals Using Digital Tools in the Hospital: Qualitative Systematic Review. JMIR Human Factors 2023;10:e50357
  2. Sallam M, Salim N, Barakat M, Al-Mahzoum K, Al-Tammemi A, Malaeb D, Hallit R, Hallit S. Assessing Health Students' Attitudes and Usage of ChatGPT in Jordan: Validation Study. JMIR Medical Education 2023;9:e48254
  3. Fraser H, Crossland D, Bacher I, Ranney M, Madsen T, Hilliard R. Comparison of Diagnostic and Triage Accuracy of Ada Health and WebMD Symptom Checkers, ChatGPT, and Physicians for Patients in an Emergency Department: Clinical Data Analysis Study. JMIR mHealth and uHealth 2023;11:e49995
  4. Takagi S, Watari T, Erabi A, Sakaguchi K. Performance of GPT-3.5 and GPT-4 on the Japanese Medical Licensing Examination: Comparison Study. JMIR Medical Education 2023;9:e48002
  5. Borchert R, Hickman C, Pepys J, Sadler T. Performance of ChatGPT on the Situational Judgement Test—A Professional Dilemmas–Based Examination for Doctors in the United Kingdom. JMIR Medical Education 2023;9:e48978
  6. Miao J, Thongprayoon C, Garcia Valencia O, Krisanapan P, Sheikh M, Davis P, Mekraksakit P, Suarez M, Craici I, Cheungpasitporn W. Performance of ChatGPT on Nephrology Test Questions. Clinical Journal of the American Society of Nephrology 2024;19(1):35
  7. Corti C, Castellano G, Curigliano G. Exploring the utility and limitations of ChatGPT in scientific literature searches. ESMO Real World Data and Digital Oncology 2023;1:100001
  8. Thirunavukarasu A. Large language models will not replace healthcare professionals: curbing popular fears and hype. Journal of the Royal Society of Medicine 2023;116(5):181
  9. Abd-alrazaq A, AlSaad R, Alhuwail D, Ahmed A, Healy P, Latifi S, Aziz S, Damseh R, Alabed Alrazak S, Sheikh J. Large Language Models in Medical Education: Opportunities, Challenges, and Future Directions. JMIR Medical Education 2023;9:e48291
  10. Thibaut G, Dabbagh A, Liverneaux P. Does Google’s Bard Chatbot perform better than ChatGPT on the European hand surgery exam? International Orthopaedics 2024;48(1):151
  11. Morita P, Abhari S, Kaur J, Lotto M, Miranda P, Oetomo A. Applying ChatGPT in public health: a SWOT and PESTLE analysis. Frontiers in Public Health 2023;11
  12. Tan T, Thirunavukarasu A, Jin L, Lim J, Poh S, Teo Z, Ang M, Chan R, Ong J, Turner A, Karlström J, Wong T, Stern J, Ting D. Artificial intelligence and digital health in global eye health: opportunities and challenges. The Lancet Global Health 2023;11(9):e1432
  13. Flores-Cohaila J, García-Vicente A, Vizcarra-Jiménez S, De la Cruz-Galán J, Gutiérrez-Arratia J, Quiroga Torres B, Taype-Rondan A. Performance of ChatGPT on the Peruvian National Licensing Medical Examination: Cross-Sectional Study. JMIR Medical Education 2023;9:e48039
  14. Chakraborty C, Pal S, Bhattacharya M, Dash S, Lee S. Overview of Chatbots with special emphasis on artificial intelligence-enabled ChatGPT in medical science. Frontiers in Artificial Intelligence 2023;6
  15. Krusche M, Callhoff J, Knitza J, Ruffer N. Diagnostic accuracy of a large language model in rheumatology: comparison of physician and ChatGPT-4. Rheumatology International 2023;44(2):303
  16. Tan T, Thirunavukarasu A, Campbell J, Keane P, Pasquale L, Abramoff M, Kalpathy-Cramer J, Lum F, Kim J, Baxter S, Ting D. Generative Artificial Intelligence Through ChatGPT and Other Large Language Models in Ophthalmology. Ophthalmology Science 2023;3(4):100394
  17. Traoré S, Goetsch T, Muller B, Dabbagh A, Liverneaux P. Is ChatGPT able to pass the first part of the European Board of Hand Surgery diploma examination? Hand Surgery and Rehabilitation 2023;42(4):362
  18. Abi-Rafeh J, Xu H, Kazan R, Tevlin R, Furnas H. Large Language Models and Artificial Intelligence: A Primer for Plastic Surgeons on the Demonstrated and Potential Applications, Promises, and Limitations of ChatGPT. Aesthetic Surgery Journal 2024;44(3):329
  19. Eguia H, Sanz García J. Inteligencia artificial, ChatGPT y atención primaria. Medicina de Familia. SEMERGEN 2023;49(7):102069
  20. Thirunavukarasu A, Ting D, Elangovan K, Gutierrez L, Tan T, Ting D. Large language models in medicine. Nature Medicine 2023;29(8):1930
  21. Yang R, Tan T, Lu W, Thirunavukarasu A, Ting D, Liu N. Large language models in health care: Development, applications, and challenges. Health Care Science 2023;2(4):255
  22. Suárez A, Díaz‐Flores García V, Algar J, Gómez Sánchez M, Llorente de Pedro M, Freire Y. Unveiling the ChatGPT phenomenon: Evaluating the consistency and accuracy of endodontic question answers. International Endodontic Journal 2024;57(1):108
  23. Ting D, Tan T, Ting D. ChatGPT in ophthalmology: the dawn of a new era? Eye 2024;38(1):4
  24. Ng F, Thirunavukarasu A, Cheng H, Tan T, Gutierrez L, Lan Y, Ong J, Chong Y, Ngiam K, Ho D, Wong T, Kwek K, Doshi-Velez F, Lucey C, Coffman T, Ting D. Artificial intelligence education: An evidence-based medicine approach for consumers, translators, and developers. Cell Reports Medicine 2023;4(10):101230
  25. Miao J, Thongprayoon C, Suppadungsuk S, Garcia Valencia O, Qureshi F, Cheungpasitporn W. Innovating Personalized Nephrology Care: Exploring the Potential Utilization of ChatGPT. Journal of Personalized Medicine 2023;13(12):1681
  26. Sahin M, Sozer A, Kuzucu P, Turkmen T, Sahin M, Sozer E, Tufek O, Nernekli K, Emmez H, Celtikci E. Beyond human in neurosurgical exams: ChatGPT's success in the Turkish neurosurgical society proficiency board exams. Computers in Biology and Medicine 2024;169:107807
  27. Suárez A, Jiménez J, Llorente de Pedro M, Andreu-Vázquez C, Díaz-Flores García V, Gómez Sánchez M, Freire Y. Beyond the Scalpel: Assessing ChatGPT's potential as an auxiliary intelligent virtual assistant in oral surgery. Computational and Structural Biotechnology Journal 2024;24:46
  28. Watari T, Takagi S, Sakaguchi K, Nishizaki Y, Shimizu T, Yamamoto Y, Tokuda Y. Performance Comparison of ChatGPT-4 and Japanese Medical Residents in the General Medicine In-Training Examination: Comparison Study. JMIR Medical Education 2023;9:e52202
  29. Thirunavukarasu A. How Can the Clinical Aptitude of AI Assistants Be Assayed? Journal of Medical Internet Research 2023;25:e51603
  30. Civettini I, Zappaterra A, Granelli B, Rindone G, Aroldi A, Bonfanti S, Colombo F, Fedele M, Grillo G, Parma M, Perfetti P, Terruzzi E, Gambacorti‐Passerini C, Ramazzotti D, Cavalca F. Evaluating the performance of large language models in haematopoietic stem cell transplantation decision‐making. British Journal of Haematology 2024;204(4):1523
  31. Tangadulrat P, Sono S, Tangtrakulwanich B. Using ChatGPT for Clinical Practice and Medical Education: Cross-Sectional Survey of Medical Students’ and Physicians’ Perceptions. JMIR Medical Education 2023;9:e50658
  32. Kollitsch L, Eredics K, Marszalek M, Rauchenwald M, Brookman-May S, Burger M, Körner-Riffard K, May M. How does artificial intelligence master urological board examinations? A comparative analysis of different Large Language Models’ accuracy and reliability in the 2022 In-Service Assessment of the European Board of Urology. World Journal of Urology 2024;42(1)
  33. Al-Sharif E, Penteado R, Dib El Jalbout N, Topilow N, Shoji M, Kikkawa D, Liu C, Korn B. Evaluating the Accuracy of ChatGPT and Google BARD in Fielding Oculoplastic Patient Queries: A Comparative Study on Artificial versus Human Intelligence. Ophthalmic Plastic & Reconstructive Surgery 2024;40(3):303
  34. Sezgin E. Redefining Virtual Assistants in Health Care: The Future With Large Language Models. Journal of Medical Internet Research 2024;26:e53225
  35. Sallam M, Barakat M, Sallam M. A Preliminary Checklist (METRICS) to Standardize the Design and Reporting of Studies on Generative Artificial Intelligence–Based Models in Health Care Education and Practice: Development Study Involving a Literature Review. Interactive Journal of Medical Research 2024;13:e54704
  36. Chang Y, Wang X, Wang J, Wu Y, Yang L, Zhu K, Chen H, Yi X, Wang C, Wang Y, Ye W, Zhang Y, Chang Y, Yu P, Yang Q, Xie X. A Survey on Evaluation of Large Language Models. ACM Transactions on Intelligent Systems and Technology 2024;15(3):1
  37. Hatia A, Doldo T, Parrini S, Chisci E, Cipriani L, Montagna L, Lagana G, Guenza G, Agosta E, Vinjolli F, Hoxha M, D’Amelio C, Favaretto N, Chisci G. Accuracy and Completeness of ChatGPT-Generated Information on Interceptive Orthodontics: A Multicenter Collaborative Study. Journal of Clinical Medicine 2024;13(3):735
  38. Pereyra L, Schlottmann F, Steinberg L, Lasa J. Colorectal Cancer Prevention. Journal of Clinical Gastroenterology 2024;58(10):1022
  39. Meyer A, Riese J, Streichert T. Comparison of the Performance of GPT-3.5 and GPT-4 With That of Medical Students on the Written German Medical Licensing Examination: Observational Study. JMIR Medical Education 2024;10:e50965
  40. Su M, Lin L, Lin L, Chen Y. Assessing question characteristic influences on ChatGPT's performance and response-explanation consistency: Insights from Taiwan's Nursing Licensing Exam. International Journal of Nursing Studies 2024;153:104717
  41. Mai D, Da C, Hanh N. The use of ChatGPT in teaching and learning: a systematic review through SWOT analysis approach. Frontiers in Education 2024;9
  42. Wei Q, Yao Z, Cui Y, Wei B, Jin Z, Xu X. Evaluation of ChatGPT-generated medical responses: A systematic review and meta-analysis. Journal of Biomedical Informatics 2024;151:104620
  43. Maitland A, Fowkes R, Maitland S. Can ChatGPT pass the MRCP (UK) written examinations? Analysis of performance and errors using a clinical decision-reasoning framework. BMJ Open 2024;14(3):e080558
  44. Le M, Davis M. ChatGPT Yields a Passing Score on a Pediatric Board Preparatory Exam but Raises Red Flags. Global Pediatric Health 2024;11
  45. Peled T, Sela H, Weiss A, Grisaru‐Granovsky S, Agrawal S, Rottenstreich M. Evaluating the validity of ChatGPT responses on common obstetric issues: Potential clinical applications and implications. International Journal of Gynecology & Obstetrics 2024;166(3):1127
  46. Zhang Y, Xu L, Ji H. Author's reply: AI in medicine, bridging the chasm between potential and capability. Digestive and Liver Disease 2024;56(6):1116
  47. Laymouna M, Ma Y, Lessard D, Schuster T, Engler K, Lebouché B. Roles, Users, Benefits, and Limitations of Chatbots in Health Care: Rapid Review. Journal of Medical Internet Research 2024;26:e56930
  48. Katz U, Cohen E, Shachar E, Somer J, Fink A, Morse E, Shreiber B, Wolf I. GPT versus Resident Physicians — A Benchmark Based on Official Board Scores. NEJM AI 2024;1(5)
  49. Zhuo K, Kim P, Kovacic J, Chalasani V, Rasiah K, Menogue S, Chung A. Can Artificial Intelligence Treat My Urinary Tract Infections?—Evaluation of Health Information Provided by OpenAI™ ChatGPT on Urinary Tract Infections. Société Internationale d’Urologie Journal 2024;5(2):104
  50. Tripathi S, Sukumaran R, Dheer S, Cook T. Promptwise: Prompt Engineering Paradigm for Enhanced Patient-Large Language Model Interactions Towards Medical Education. SSRN Electronic Journal 2024
  51. Nassiri K, Akhloufi M. Recent Advances in Large Language Models for Healthcare. BioMedInformatics 2024;4(2):1097
  52. Thirunavukarasu A, Mahmood S, Malem A, Foster W, Sanghera R, Hassan R, Zhou S, Wong S, Wong Y, Chong Y, Shakeel A, Chang Y, Tan B, Jain N, Tan T, Rauz S, Ting D, Ting D, Luo M. Large language models approach expert-level clinical knowledge and reasoning in ophthalmology: A head-to-head cross-sectional study. PLOS Digital Health 2024;3(4):e0000341
  53. Tessler I, Wolfovitz A, Alon E, Gecel N, Livneh N, Zimlichman E, Klang E. ChatGPT’s adherence to otolaryngology clinical practice guidelines. European Archives of Oto-Rhino-Laryngology 2024;281(7):3829
  54. Varghese C, Harrison E, O’Grady G, Topol E. Artificial intelligence in surgery. Nature Medicine 2024;30(5):1257
  55. Cong-Lem N, Soyoof A, Tsering D. A Systematic Review of the Limitations and Associated Opportunities of ChatGPT. International Journal of Human–Computer Interaction 2024:1
  56. Scott I, Zuccon G. The new paradigm in machine learning – foundation models, large language models and beyond: a primer for physicians. Internal Medicine Journal 2024;54(5):705
  57. Bonnechère B. Unlocking the Black Box? A Comprehensive Exploration of Large Language Models in Rehabilitation. American Journal of Physical Medicine & Rehabilitation 2024
  58. Ozden I, Gokyar M, Ozden M, Sazak Ovecoglu H. Assessment of artificial intelligence applications in responding to dental trauma. Dental Traumatology 2024;40(6):722
  59. Duggan R, Tsuruda K. ChatGPT performance on radiation technologist and therapist entry to practice exams. Journal of Medical Imaging and Radiation Sciences 2024;55(4):101426
  60. Mousavi M, Shafiee S, Harley J, Cheung J, Abbasgholizadeh Rahimi S. Performance of generative pre-trained transformers (GPTs) in Certification Examination of the College of Family Physicians of Canada. Family Medicine and Community Health 2024;12(Suppl 1):e002626
  61. Şan H, Bayrakcı Ö, Çağdaş B, Serdengeçti M, Alagöz E. Reliability and readability analysis of ChatGPT-4 and Google Bard as a patient information source for the most commonly applied radionuclide treatments in cancer patients. Revista Española de Medicina Nuclear e Imagen Molecular (English Edition) 2024;43(4):500021
  62. Suwała S, Szulc P, Guzowski C, Kamińska B, Dorobiała J, Wojciechowska K, Berska M, Kubicka O, Kosturkiewicz O, Kosztulska B, Rajewska A, Junik R. ChatGPT-3.5 passes Poland’s medical final examination—Is it possible for ChatGPT to become a doctor in Poland? SAGE Open Medicine 2024;12
  63. Ong J, Chang S, William W, Butte A, Shah N, Chew L, Liu N, Doshi-Velez F, Lu W, Savulescu J, Ting D. Medical Ethics of Large Language Models in Medicine. NEJM AI 2024;1(7)
  64. Hager P, Jungmann F, Holland R, Bhagat K, Hubrecht I, Knauer M, Vielhauer J, Makowski M, Braren R, Kaissis G, Rueckert D. Evaluation and mitigation of the limitations of large language models in clinical decision-making. Nature Medicine 2024;30(9):2613
  65. Lucas M, Yang J, Pomeroy J, Yang C. Reasoning with large language models for medical question answering. Journal of the American Medical Informatics Association 2024;31(9):1964
  66. Şan H, Bayrakçi Ö, Çağdaş B, Serdengeçti M, Alagöz E. Análisis de confiabilidad y lectibilidad de ChatGPT-4 y Google Bard como fuente de información del paciente para los tratamientos con radionúclidos más comúnmente aplicados en pacientes con cáncer. Revista Española de Medicina Nuclear e Imagen Molecular 2024;43(4):500021
  67. Chow J, Cheng T, Chien T, Chou W. Assessing ChatGPT’s Capability for Multiple Choice Questions Using RaschOnline: Observational Study. JMIR Formative Research 2024;8:e46800
  68. Moglia A, Georgiou K, Cerveri P, Mainardi L, Satava R, Cuschieri A. Large language models in healthcare: from a systematic review on medical examinations to a comparative analysis on fundamentals of robotic surgery online test. Artificial Intelligence Review 2024;57(9)
  69. Fatima A, Shafique M, Alam K, Fadlalla Ahmed T, Mustafa M. ChatGPT in medicine: A cross-disciplinary systematic review of ChatGPT’s (artificial intelligence) role in research, clinical practice, education, and patient interaction. Medicine 2024;103(32):e39250
  70. Casey J, Dworkin M, Winschel J, Molino J, Daher M, Katarincic J, Gil J, Akelman E. ChatGPT: A concise Google alternative for people seeking accurate and comprehensive carpal tunnel syndrome information. Hand Surgery and Rehabilitation 2024;43(5):101757
  71. Heinke A, Radgoudarzi N, Huang B, Baxter S. A review of ophthalmology education in the era of generative artificial intelligence. Asia-Pacific Journal of Ophthalmology 2024;13(4):100089
  72. Goodings A, Kajitani S, Chhor A, Albakri A, Pastrak M, Kodancha M, Ives R, Lee Y, Kajitani K. Assessment of ChatGPT-4 in Family Medicine Board Examinations Using Advanced AI Learning and Analytical Methods: Observational Study. JMIR Medical Education 2024;10:e56128
  73. Cherrez-Ojeda I, Gallardo-Bastidas J, Robles-Velasco K, Osorio M, Velez Leon E, Leon Velastegui M, Pauletto P, Aguilar-Díaz F, Squassi A, González Eras S, Cordero Carrasco E, Chavez Gonzalez K, Calderon J, Bousquet J, Bedbrook A, Faytong-Haro M. Understanding Health Care Students’ Perceptions, Beliefs, and Attitudes Toward AI-Powered Language Models: Cross-Sectional Study. JMIR Medical Education 2024;10:e51757
  74. Pan G, Ni J. A cross sectional investigation of ChatGPT-like large language models application among medical students in China. BMC Medical Education 2024;24(1)
  75. Rodríguez Weber F, Portela Ortiz J, Enríquez Barajas A. La inteligencia artificial (IA) en la medicina y su aprendizaje. Acta Médica Grupo Ángeles 2024;22(3):261
  76. Wang L, Wan Z, Ni C, Song Q, Li Y, Clayton E, Malin B, Yin Z. Applications and Concerns of ChatGPT and Other Conversational Large Language Models in Health Care: Systematic Review. Journal of Medical Internet Research 2024;26:e22769
  77. Ros-Arlanzón P, Perez-Sempere A. Evaluating AI Competence in Specialized Medicine: A Comparative Analysis of ChatGPT and Neurologists in a Neurology Specialist exam in Spain (Preprint). JMIR Medical Education 2024
  78. Ronquillo J, South B, Wiemken T, Jadhav A, Watt S, De Jesus M, Habtezion A. AI for Oncology Drug Data Harmonization — Amazon versus OpenAI. NEJM AI 2024;1(11)
  79. Attanasio M, Mazza M, Le Donne I, Masedu F, Greco M, Valenti M. Does ChatGPT have a typical or atypical theory of mind? Frontiers in Psychology 2024;15
  80. Khabaz K, Newman‐Hung N, Kallini J, Kendal J, Christ A, Bernthal N, Wessel L. Assessment of Artificial Intelligence Chatbot Responses to Common Patient Questions on Bone Sarcoma. Journal of Surgical Oncology 2024
  81. Holt N, Byrne M. The Role of Artificial Intelligence and Big Data for Gastrointestinal Disease. Gastrointestinal Endoscopy Clinics of North America 2024
  82. Liu F, Chang X, Zhu Q, Huang Y, Li Y, Wang H. Assessing clinical medicine students’ acceptance of large language model: based on technology acceptance model. BMC Medical Education 2024;24(1)
  83. Masison J, Lehmann H, Wan J. Utilization of Computable Phenotypes in Electronic Health Record Research: A Review and Case Study in Atopic Dermatitis. Journal of Investigative Dermatology 2024
  84. Hassan M, Ayad M, Nembhard C, Hayes-Dixon A, Lin A, Janjua M, Franko J, Tee M. Artificial Intelligence Compared to Manual Selection of Prospective Surgical Residents. Journal of Surgical Education 2025;82(1):103308
  85. Kovari A. AI for Decision Support: Balancing Accuracy, Transparency, and Trust Across Sectors. Information 2024;15(11):725

Books/Policy Documents

  1. Fernández-Pichel M, Losada D, Pichel J. Computational Science – ICCS 2024.
  2. Sanmukh S, Krzykawska-Serda M, Dragan P, Baron S, Lobaccaro J, Latek D.
  3. Di Ieva A, Stewart C, Suero Molina E. Computational Neurosurgery.