Many patients and their family members consult with “Dr. Google” and other search engines to better understand their diagnoses and treatment options. Artificial intelligence (AI) tools such as machine-learning models and large language models have fundamentally changed the information that patients can obtain online.
A new study presented at the AAOS 2025 Annual Meeting assessed the ability of AI tools to provide accurate and reliable information regarding surgical options for carpal tunnel syndrome. The researchers found that AI tools answered patients’ questions significantly more accurately than generic web searches. The most accurate information was derived from academic and nonprofit websites, underscoring the importance of reliable sources to inform AI responses.
“The integration of AI in patient education and care is transforming how patients access and process medical information. With the proliferation of machine learning tools and large language models, there is a critical need to evaluate whether these tools can deliver accurate, reliable, and comprehensive information,” presenting author Sree M. Vemu, MD, orthopaedic surgery resident at Houston Methodist, told AAOS Now.
He and colleagues designed a study to compare the accuracy and comprehensiveness of AI-generated information about carpal tunnel surgery with information found via a traditional search engine. The researchers compared ChatGPT and Google Bard (now known as Google Gemini) against Google, the predominant search engine in the United States.
Using the initial search term “carpal tunnel release,” the researchers collected the top 10 questions and answers from Google’s “People also ask” tool, such as, “What happens during a carpal tunnel release?” Repetitive or irrelevant questions were excluded. They conducted a similar query on ChatGPT (versions 3.5 and 4, with WebChat GPT and KeyMate AI plugins) and Google Bard. The researchers verified sources when available. (Google Search and ChatGPT 4 provided sources, but ChatGPT 3.5 and Google Bard did not.)
Subsequently, two board-certified orthopaedic hand surgeons, who were blinded to the sources of the information, examined the content for accuracy. They graded content on the following scale: 1 (incorrect), 2 (mixed correct and incorrect), 3 (correct but not comprehensive), or 4 (comprehensive and correct). The researchers then collapsed that scale into two categories: inaccurate (grade 1 or 2) and accurate (grade 3 or 4).
Overall, the analysis showed:
- Only one Google search answer was accurate (10 percent).
- ChatGPT 3.5 was 70 percent accurate.
- ChatGPT 4 with WebChat GPT was 70 percent accurate.
- ChatGPT with KeyMate AI was 100 percent accurate.
- Google Bard was 90 percent accurate.
Content graded 4 by both surgeons was considered comprehensive:
- No Google search or Google Bard answers met that standard.
- Twenty percent of ChatGPT 3.5 content was deemed comprehensive.
- Twenty percent of ChatGPT 4 content with WebChat GPT qualified.
- Half of the ChatGPT with KeyMate AI answers were deemed comprehensive by both surgeons.
The authors concluded that the AI tools delivered substantial, accurate information about carpal tunnel release and outperformed a traditional Google search.
“The ability of AI to provide accurate information was anticipated, given its training on vast datasets that include high-quality medical literature. Existing web search methods often lack consistency in accuracy, and their sources may not always be adequately vetted for medical reliability. This was concerning considering their ubiquitous role as a primary information source for many patients,” Dr. Vemu said.
He plans further research to assess the impact of AI tools on patient decision making and outcomes. He also plans to explore methods to enhance the reliability of AI tools, including partnerships with medical organizations to integrate curated, evidence-based content into AI training data.
“Orthopaedic surgeons can encourage collaboration between healthcare organizations and AI developers to establish standards for accuracy, reliability, and source transparency in medical AI tools,” Dr. Vemu concluded. “This research does not advocate replacing physician-patient interactions but highlights AI as a supplementary tool to enhance patient understanding and support shared decision making.”
Dr. Vemu’s coauthors of “Carpal Tunnel Surgery Information: Comparison of AI Generated Information with Google Search for Common Patient Questions” are Brian Phelps, MD; Chia Heng Wu, MD, MBA, FAAOS; and Shari Liberman, MD.
Keightley Amen, BA, ELS, is a freelance writer for AAOS Now.