Published in Vol 9 (2023)

This is a member publication of University of Cambridge (Jisc)

Trialling a Large Language Model (ChatGPT) in General Practice With the Applied Knowledge Test: Observational Study Demonstrating Opportunities and Limitations in Primary Care


  1. Wosny M, Strasser L, Hastings J. Experience of Health Care Professionals Using Digital Tools in the Hospital: Qualitative Systematic Review. JMIR Human Factors 2023;10:e50357
  2. Sallam M, Salim N, Barakat M, Al-Mahzoum K, Al-Tammemi A, Malaeb D, Hallit R, Hallit S. Assessing Health Students' Attitudes and Usage of ChatGPT in Jordan: Validation Study. JMIR Medical Education 2023;9:e48254
  3. Fraser H, Crossland D, Bacher I, Ranney M, Madsen T, Hilliard R. Comparison of Diagnostic and Triage Accuracy of Ada Health and WebMD Symptom Checkers, ChatGPT, and Physicians for Patients in an Emergency Department: Clinical Data Analysis Study. JMIR mHealth and uHealth 2023;11:e49995
  4. Takagi S, Watari T, Erabi A, Sakaguchi K. Performance of GPT-3.5 and GPT-4 on the Japanese Medical Licensing Examination: Comparison Study. JMIR Medical Education 2023;9:e48002
  5. Borchert R, Hickman C, Pepys J, Sadler T. Performance of ChatGPT on the Situational Judgement Test—A Professional Dilemmas–Based Examination for Doctors in the United Kingdom. JMIR Medical Education 2023;9:e48978
  6. Miao J, Thongprayoon C, Garcia Valencia O, Krisanapan P, Sheikh M, Davis P, Mekraksakit P, Suarez M, Craici I, Cheungpasitporn W. Performance of ChatGPT on Nephrology Test Questions. Clinical Journal of the American Society of Nephrology 2024;19(1):35
  7. Corti C, Castellano G, Curigliano G. Exploring the utility and limitations of ChatGPT in scientific literature searches. ESMO Real World Data and Digital Oncology 2023;1:100001
  8. Thirunavukarasu A. Large language models will not replace healthcare professionals: curbing popular fears and hype. Journal of the Royal Society of Medicine 2023;116(5):181
  9. Abd-alrazaq A, AlSaad R, Alhuwail D, Ahmed A, Healy P, Latifi S, Aziz S, Damseh R, Alabed Alrazak S, Sheikh J. Large Language Models in Medical Education: Opportunities, Challenges, and Future Directions. JMIR Medical Education 2023;9:e48291
  10. Thibaut G, Dabbagh A, Liverneaux P. Does Google’s Bard Chatbot perform better than ChatGPT on the European hand surgery exam? International Orthopaedics 2024;48(1):151
  11. Morita P, Abhari S, Kaur J, Lotto M, Miranda P, Oetomo A. Applying ChatGPT in public health: a SWOT and PESTLE analysis. Frontiers in Public Health 2023;11
  12. Tan T, Thirunavukarasu A, Jin L, Lim J, Poh S, Teo Z, Ang M, Chan R, Ong J, Turner A, Karlström J, Wong T, Stern J, Ting D. Artificial intelligence and digital health in global eye health: opportunities and challenges. The Lancet Global Health 2023;11(9):e1432
  13. Flores-Cohaila J, García-Vicente A, Vizcarra-Jiménez S, De la Cruz-Galán J, Gutiérrez-Arratia J, Quiroga Torres B, Taype-Rondan A. Performance of ChatGPT on the Peruvian National Licensing Medical Examination: Cross-Sectional Study. JMIR Medical Education 2023;9:e48039
  14. Chakraborty C, Pal S, Bhattacharya M, Dash S, Lee S. Overview of Chatbots with special emphasis on artificial intelligence-enabled ChatGPT in medical science. Frontiers in Artificial Intelligence 2023;6
  15. Krusche M, Callhoff J, Knitza J, Ruffer N. Diagnostic accuracy of a large language model in rheumatology: comparison of physician and ChatGPT-4. Rheumatology International 2023;44(2):303
  16. Tan T, Thirunavukarasu A, Campbell J, Keane P, Pasquale L, Abramoff M, Kalpathy-Cramer J, Lum F, Kim J, Baxter S, Ting D. Generative Artificial Intelligence Through ChatGPT and Other Large Language Models in Ophthalmology. Ophthalmology Science 2023;3(4):100394
  17. Traoré S, Goetsch T, Muller B, Dabbagh A, Liverneaux P. Is ChatGPT able to pass the first part of the European Board of Hand Surgery diploma examination? Hand Surgery and Rehabilitation 2023;42(4):362
  18. Abi-Rafeh J, Xu H, Kazan R, Tevlin R, Furnas H. Large Language Models and Artificial Intelligence: A Primer for Plastic Surgeons on the Demonstrated and Potential Applications, Promises, and Limitations of ChatGPT. Aesthetic Surgery Journal 2024;44(3):329
  19. Eguia H, Sanz García J. Inteligencia artificial, ChatGPT y atención primaria. Medicina de Familia. SEMERGEN 2023;49(7):102069
  20. Thirunavukarasu A, Ting D, Elangovan K, Gutierrez L, Tan T, Ting D. Large language models in medicine. Nature Medicine 2023;29(8):1930
  21. Yang R, Tan T, Lu W, Thirunavukarasu A, Ting D, Liu N. Large language models in health care: Development, applications, and challenges. Health Care Science 2023;2(4):255
  22. Suárez A, Díaz‐Flores García V, Algar J, Gómez Sánchez M, Llorente de Pedro M, Freire Y. Unveiling the ChatGPT phenomenon: Evaluating the consistency and accuracy of endodontic question answers. International Endodontic Journal 2024;57(1):108
  23. Ting D, Tan T, Ting D. ChatGPT in ophthalmology: the dawn of a new era? Eye 2024;38(1):4
  24. Ng F, Thirunavukarasu A, Cheng H, Tan T, Gutierrez L, Lan Y, Ong J, Chong Y, Ngiam K, Ho D, Wong T, Kwek K, Doshi-Velez F, Lucey C, Coffman T, Ting D. Artificial intelligence education: An evidence-based medicine approach for consumers, translators, and developers. Cell Reports Medicine 2023;4(10):101230
  25. Miao J, Thongprayoon C, Suppadungsuk S, Garcia Valencia O, Qureshi F, Cheungpasitporn W. Innovating Personalized Nephrology Care: Exploring the Potential Utilization of ChatGPT. Journal of Personalized Medicine 2023;13(12):1681
  26. Sahin M, Sozer A, Kuzucu P, Turkmen T, Sahin M, Sozer E, Tufek O, Nernekli K, Emmez H, Celtikci E. Beyond human in neurosurgical exams: ChatGPT's success in the Turkish neurosurgical society proficiency board exams. Computers in Biology and Medicine 2024;169:107807
  27. Suárez A, Jiménez J, Llorente de Pedro M, Andreu-Vázquez C, Díaz-Flores García V, Gómez Sánchez M, Freire Y. Beyond the Scalpel: Assessing ChatGPT's potential as an auxiliary intelligent virtual assistant in oral surgery. Computational and Structural Biotechnology Journal 2024;24:46
  28. Watari T, Takagi S, Sakaguchi K, Nishizaki Y, Shimizu T, Yamamoto Y, Tokuda Y. Performance Comparison of ChatGPT-4 and Japanese Medical Residents in the General Medicine In-Training Examination: Comparison Study. JMIR Medical Education 2023;9:e52202
  29. Thirunavukarasu A. How Can the Clinical Aptitude of AI Assistants Be Assayed? Journal of Medical Internet Research 2023;25:e51603
  30. Civettini I, Zappaterra A, Granelli B, Rindone G, Aroldi A, Bonfanti S, Colombo F, Fedele M, Grillo G, Parma M, Perfetti P, Terruzzi E, Gambacorti‐Passerini C, Ramazzotti D, Cavalca F. Evaluating the performance of large language models in haematopoietic stem cell transplantation decision‐making. British Journal of Haematology 2024;204(4):1523
  31. Tangadulrat P, Sono S, Tangtrakulwanich B. Using ChatGPT for Clinical Practice and Medical Education: Cross-Sectional Survey of Medical Students’ and Physicians’ Perceptions. JMIR Medical Education 2023;9:e50658
  32. Kollitsch L, Eredics K, Marszalek M, Rauchenwald M, Brookman-May S, Burger M, Körner-Riffard K, May M. How does artificial intelligence master urological board examinations? A comparative analysis of different Large Language Models’ accuracy and reliability in the 2022 In-Service Assessment of the European Board of Urology. World Journal of Urology 2024;42(1)
  33. Al-Sharif E, Penteado R, Dib El Jalbout N, Topilow N, Shoji M, Kikkawa D, Liu C, Korn B. Evaluating the Accuracy of ChatGPT and Google BARD in Fielding Oculoplastic Patient Queries: A Comparative Study on Artificial versus Human Intelligence. Ophthalmic Plastic & Reconstructive Surgery 2024;40(3):303
  34. Sezgin E. Redefining Virtual Assistants in Health Care: The Future With Large Language Models. Journal of Medical Internet Research 2024;26:e53225
  35. Sallam M, Barakat M, Sallam M. A Preliminary Checklist (METRICS) to Standardize the Design and Reporting of Studies on Generative Artificial Intelligence–Based Models in Health Care Education and Practice: Development Study Involving a Literature Review. Interactive Journal of Medical Research 2024;13:e54704
  36. Chang Y, Wang X, Wang J, Wu Y, Yang L, Zhu K, Chen H, Yi X, Wang C, Wang Y, Ye W, Zhang Y, Chang Y, Yu P, Yang Q, Xie X. A Survey on Evaluation of Large Language Models. ACM Transactions on Intelligent Systems and Technology 2024;15(3):1
  37. Hatia A, Doldo T, Parrini S, Chisci E, Cipriani L, Montagna L, Lagana G, Guenza G, Agosta E, Vinjolli F, Hoxha M, D’Amelio C, Favaretto N, Chisci G. Accuracy and Completeness of ChatGPT-Generated Information on Interceptive Orthodontics: A Multicenter Collaborative Study. Journal of Clinical Medicine 2024;13(3):735
  38. Pereyra L, Schlottmann F, Steinberg L, Lasa J. Colorectal Cancer Prevention. Journal of Clinical Gastroenterology 2024
  39. Meyer A, Riese J, Streichert T. Comparison of the Performance of GPT-3.5 and GPT-4 With That of Medical Students on the Written German Medical Licensing Examination: Observational Study. JMIR Medical Education 2024;10:e50965
  40. Su M, Lin L, Lin L, Chen Y. Assessing question characteristic influences on ChatGPT's performance and response-explanation consistency: Insights from Taiwan's Nursing Licensing Exam. International Journal of Nursing Studies 2024;153:104717
  41. Mai D, Da C, Hanh N. The use of ChatGPT in teaching and learning: a systematic review through SWOT analysis approach. Frontiers in Education 2024;9
  42. Wei Q, Yao Z, Cui Y, Wei B, Jin Z, Xu X. Evaluation of ChatGPT-generated medical responses: A systematic review and meta-analysis. Journal of Biomedical Informatics 2024;151:104620
  43. Maitland A, Fowkes R, Maitland S. Can ChatGPT pass the MRCP (UK) written examinations? Analysis of performance and errors using a clinical decision-reasoning framework. BMJ Open 2024;14(3):e080558
  44. Le M, Davis M. ChatGPT Yields a Passing Score on a Pediatric Board Preparatory Exam but Raises Red Flags. Global Pediatric Health 2024;11
  45. Peled T, Sela H, Weiss A, Grisaru‐Granovsky S, Agrawal S, Rottenstreich M. Evaluating the validity of ChatGPT responses on common obstetric issues: Potential clinical applications and implications. International Journal of Gynecology & Obstetrics 2024
  46. Zhang Y, Xu L, Ji H. Author's reply: AI in medicine, bridging the chasm between potential and capability. Digestive and Liver Disease 2024
  47. Laymouna M, Ma Y, Lessard D, Schuster T, Engler K, Lebouché B. Roles, Users, Benefits and Limitations of Chatbots in Healthcare: A Rapid Review (Preprint). Journal of Medical Internet Research 2024
  48. Katz U, Cohen E, Shachar E, Somer J, Fink A, Morse E, Shreiber B, Wolf I. GPT versus Resident Physicians — A Benchmark Based on Official Board Scores. NEJM AI 2024;1(5)
  49. Zhuo K, Kim P, Kovacic J, Chalasani V, Rasiah K, Menogue S, Chung A. Can Artificial Intelligence Treat My Urinary Tract Infections?—Evaluation of Health Information Provided by OpenAI™ ChatGPT on Urinary Tract Infections. Société Internationale d’Urologie Journal 2024;5(2):104
  50. Tripathi S, Sukumaran R, Dheer S, Cook T. Promptwise: Prompt Engineering Paradigm for Enhanced Patient-Large Language Model Interactions Towards Medical Education. SSRN Electronic Journal 2024
  51. Nassiri K, Akhloufi M. Recent Advances in Large Language Models for Healthcare. BioMedInformatics 2024;4(2):1097
  52. Thirunavukarasu A, Mahmood S, Malem A, Foster W, Sanghera R, Hassan R, Zhou S, Wong S, Wong Y, Chong Y, Shakeel A, Chang Y, Tan B, Jain N, Tan T, Rauz S, Ting D, Ting D, Luo M. Large language models approach expert-level clinical knowledge and reasoning in ophthalmology: A head-to-head cross-sectional study. PLOS Digital Health 2024;3(4):e0000341
  53. Tessler I, Wolfovitz A, Alon E, Gecel N, Livneh N, Zimlichman E, Klang E. ChatGPT’s adherence to otolaryngology clinical practice guidelines. European Archives of Oto-Rhino-Laryngology 2024
  54. Varghese C, Harrison E, O’Grady G, Topol E. Artificial intelligence in surgery. Nature Medicine 2024
  55. Cong-Lem N, Soyoof A, Tsering D. A Systematic Review of the Limitations and Associated Opportunities of ChatGPT. International Journal of Human–Computer Interaction 2024:1
  56. Scott I, Zuccon G. The new paradigm in machine learning – foundation models, large language models and beyond: a primer for physicians. Internal Medicine Journal 2024;54(5):705
  57. Bonnechère B. Unlocking the Black Box? A Comprehensive Exploration of Large Language Models in Rehabilitation. American Journal of Physical Medicine & Rehabilitation 2024;103(6):532
  58. Ozden I, Gokyar M, Ozden M, Sazak Ovecoglu H. Assessment of artificial intelligence applications in responding to dental trauma. Dental Traumatology 2024