Search Engines Beat AI For Reliable Health Information

2025-06-23 · 4 minute read
Health Information
AI Chatbots
Search Engines

The internet has become an indispensable tool for individuals seeking health information, profoundly influencing patient understanding, decision-making, and overall public health outcomes. However, the quality and readability of online health content can vary dramatically. With the recent surge in generative artificial intelligence (AI) tools like ChatGPT and Gemini, we face new challenges and opportunities in how medical information is accessed and interpreted by the public.

This new landscape prompted a critical investigation into the reliability of these emerging AI platforms compared to established traditional search engines.

Pitting Search Engines Against AI Chatbots: The Study

A recent study aimed to rigorously evaluate and compare the quality, credibility, and readability of consumer health information concerning rosacea. The research pitted traditional search engines, Google and Bing, against popular generative AI platforms, ChatGPT (specifically GPT-4) and Gemini.

The core objective was to determine which sources provide the most dependable health information, using a suite of three validated assessment instruments: the DISCERN tool, the JAMA Benchmark Criteria, and Flesch-Kincaid Readability Metrics.

How Information Quality Was Measured

Researchers collected twenty health-related webpages from each platform—Google, Bing, Gemini, and ChatGPT—using a standardized query related to rosacea treatment options. To ensure objectivity, each piece of information was independently assessed by two reviewers.

Quality and credibility were evaluated using:

  • DISCERN instrument: This tool assesses the quality of written information on treatment choices.
  • Adapted JAMA benchmark criteria: These criteria evaluate the credibility of health information, focusing on aspects like authorship, attribution, disclosure, and currency of information.
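The study reports DISCERN results as a mean score with a standard deviation on the instrument's 1-to-5 item scale. A minimal sketch of that summary step is shown below; the function name and input format are illustrative, not the authors' actual code.

```python
from statistics import mean, stdev

def discern_summary(item_scores: list[int]) -> tuple[float, float]:
    """Summarize DISCERN item ratings (each item scored 1-5) as (mean, SD)."""
    assert all(1 <= s <= 5 for s in item_scores), "DISCERN items are rated 1-5"
    return mean(item_scores), stdev(item_scores)
```

For example, item ratings of [3, 4, 3, 4] yield a mean of 3.5, in line with how the per-platform scores in the study (such as Google's 3.33 ± 0.53) are reported.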

Readability was measured using:

  • Flesch Reading Ease scores: Indicates how easy text is to read.
  • Flesch-Kincaid Grade Level scores: Estimates the US school grade level needed to understand the text.
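Both readability metrics are computed from the same two ratios: average words per sentence and average syllables per word. The sketch below uses the standard published formulas; the syllable counter is a rough vowel-group heuristic for illustration, not the study's actual tooling.

```python
import re

def count_syllables(word: str) -> int:
    # Rough heuristic: count groups of consecutive vowels (minimum 1).
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def readability(text: str) -> tuple[float, float]:
    """Return (Flesch Reading Ease, Flesch-Kincaid Grade Level)."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    wps = len(words) / sentences   # average words per sentence
    spw = syllables / len(words)   # average syllables per word
    ease = 206.835 - 1.015 * wps - 84.6 * spw
    grade = 0.39 * wps + 11.8 * spw - 15.59
    return ease, grade
```

Higher Reading Ease scores mean easier text, while the Grade Level estimate maps directly to US school grades, which is why public-health guidance is usually framed in grade-level terms.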

Statistical analysis, including one-way ANOVA with Bonferroni correction, was employed to compare the performance of the different platforms. Cohen’s Kappa was used to measure the consistency of ratings between the two independent reviewers.
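Cohen's Kappa compares the reviewers' observed agreement against the agreement expected by chance from their rating distributions. A minimal pure-Python sketch follows; the variable names and example data are illustrative, not taken from the study.

```python
from collections import Counter

def cohens_kappa(rater_a: list, rater_b: list) -> float:
    """Cohen's kappa for two raters scoring the same items."""
    assert len(rater_a) == len(rater_b) and rater_a
    n = len(rater_a)
    # Observed agreement: fraction of items both raters scored identically.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: expected overlap if ratings were independent.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (p_o - p_e) / (1 - p_e) if p_e != 1 else 1.0
```

A kappa near 1 indicates strong agreement beyond chance, which is the property the study cites as evidence that its quality ratings were consistent.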

The Verdict: Google Leads, AI Chatbots Trail Behind

The study's findings revealed a clear hierarchy in the quality and credibility of health information. Google emerged as the top performer, achieving the highest mean scores for both quality (DISCERN score: 3.33 ± 0.53) and credibility (JAMA score: 3.70 ± 0.44).

Bing ranked second, followed by Gemini. Notably, ChatGPT received the lowest scores across all quality and credibility measures. This suggests that while AI chatbots are rapidly gaining popularity, they currently lag significantly behind traditional search engines, particularly Google, in providing reliable health information.

Reading Between the Lines: Readability Concerns

When it came to readability, the analysis showed no statistically significant differences between the four platforms. However, a crucial finding was that all content, regardless of the source, exceeded the recommended reading levels for public health information. This indicates that health information from both search engines and AI chatbots is often too complex for a significant portion of the public to easily understand.

On a positive note for the study's methodology, the Cohen’s Kappa statistic indicated strong inter-rater agreement across the DISCERN items, bolstering the reliability of the quality assessments.

The Path Forward: Ensuring Trustworthy AI Health Content

The conclusions drawn from this research are significant. Google remains the most reliable source of high-quality, readable health information among the platforms evaluated. Generative AI tools like ChatGPT and Gemini, despite their increasing adoption, demonstrated considerable limitations concerning the accuracy, transparency, and complexity of the health information they provide.

These findings underscore an urgent need for several actions:

  • Improved Oversight: Mechanisms may be needed to monitor and regulate AI-generated health content.
  • Enhanced Transparency: AI tools should be more transparent about how they generate information and the sources they use.
  • User Education: The public needs to be educated on how to critically evaluate AI-generated health content and understand its potential limitations.

As AI continues to evolve, ongoing research and vigilance will be essential to ensure that patients can access health information that is not only readily available but also accurate, credible, and understandable.
