Published on in Vol 11 (2025)
Preprints (earlier versions) of this paper are
available at
https://preprints.jmir.org/preprint/58898, first published
.
Journals
- Agarwal M, Sharma P, Wani P. Evaluating the Accuracy and Reliability of Large Language Models (ChatGPT, Claude, DeepSeek, Gemini, Grok, and Le Chat) in Answering Item-Analyzed Multiple-Choice Questions on Blood Physiology. Cureus 2025 View
- Shean R, Shah T, Pandiarajan A, Tang A, Bolo K, Nguyen V, Xu B. A comparative analysis of DeepSeek R1, DeepSeek-R1-Lite, OpenAi o1 Pro, and Grok 3 performance on ophthalmology board-style questions. Scientific Reports 2025;15(1) View