How Accurate Is ChatGPT On Celiac Disease Info

2025-08-15 • Rostami-Nejad, Mohammad • 5-minute read

The Growing Role of AI in Healthcare

Artificial intelligence (AI) is rapidly changing healthcare, offering tools for everything from diagnosing diseases to optimizing treatments. A major part of this shift is AI-powered chatbots such as ChatGPT, which are built on large language models that interpret questions written in everyday language and generate detailed, human-like responses. As more people turn to the internet for health advice, these tools are becoming a common first source of information.

For those with chronic conditions like Celiac Disease (CD), online resources are especially vital. CD is a complex autoimmune disorder triggered by gluten, requiring a strict and often challenging lifelong diet. Patients frequently search online for guidance on managing their condition. While AI chatbots offer a convenient source of information, it's crucial to verify their accuracy and reliability to prevent the spread of harmful misinformation.

Putting ChatGPT to the Test on Celiac Disease

To determine if ChatGPT is a trustworthy resource, researchers designed a study to evaluate its performance on Celiac Disease-related questions. They selected the widely accessible ChatGPT-3.5 model and compiled a list of 20 frequently asked questions (FAQs) that cover the essential aspects of the disease.

These questions were grouped into three main categories:

Definition and Causes: Basic questions about what CD is and why it occurs.
Diagnosis: How doctors identify and confirm the disease.
Treatment: Strategies for managing CD, primarily through a gluten-free diet.

To test for consistency, each of the 20 questions was asked in three separate sessions. The 60 total responses were then independently evaluated by two leading experts in Celiac Disease: a medical immunologist and a gastroenterologist. Each expert rated the responses on a 5-point scale, where 1 meant 'Strongly Disagree' (completely incorrect) and 5 meant 'Strongly Agree' (accurate and comprehensive).
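The design described above can be sketched as a small data model. This is only an illustration of the counts and the rating scale stated in the article; the names `Rating`, `response_slots`, and `rating_slots` are hypothetical and do not come from the study.

```python
from dataclasses import dataclass
from itertools import product

N_QUESTIONS = 20   # FAQs in the study
N_SESSIONS = 3     # each question was asked in three separate sessions
RATERS = ("immunologist", "gastroenterologist")  # the two expert evaluators


@dataclass(frozen=True)
class Rating:
    """One expert's score for one response (hypothetical record layout)."""
    question: int
    session: int
    rater: str
    score: int  # 1 = Strongly Disagree ... 5 = Strongly Agree

    def __post_init__(self):
        # Reject anything outside the 5-point scale used in the study.
        if not 1 <= self.score <= 5:
            raise ValueError(f"score must be between 1 and 5, got {self.score}")


def response_slots():
    """Every (question, session) pair, i.e. every ChatGPT response."""
    return list(product(range(1, N_QUESTIONS + 1), range(1, N_SESSIONS + 1)))


def rating_slots():
    """Every (question, session, rater) triple, i.e. every expert score."""
    return [(q, s, r) for q, s in response_slots() for r in RATERS]


print(len(response_slots()))  # 60 responses
print(len(rating_slots()))    # 120 individual ratings
```

Keeping one record per rating, rather than one averaged score per question, is what makes the agreement and consistency checks discussed later possible.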

The Accuracy Scorecard: How Did ChatGPT Perform?

Overall, ChatGPT performed remarkably well. The experts' combined scores for its answers were consistently high, mostly ranging between 4 (Agree) and 5 (Strongly Agree). This indicates that the AI provided information that was largely accurate and comprehensive.

Notably, ChatGPT excelled in the 'Treatment' category. Questions about practical management, such as "How can people with CD manage their symptoms while traveling or eating out?", received perfect scores. This highlights its potential as a useful tool for day-to-day living with the disease.

Its lowest scores, though still high, were for questions about the causes of CD. In these cases, the experts felt the answers were correct but could have been more detailed or were missing some nuance.

Figure: Average expert scores for ChatGPT responses, broken down by question category (Definition and Causes, Diagnosis, Treatment).

Expert Agreement and Reliability Insights

While the overall accuracy was high, the study also looked at reliability: how closely the two experts agreed with each other (inter-rater) and how consistent each expert's own scoring was (intra-rater).

The agreement between the two experts was only fair. Although both consistently gave high scores, they sometimes differed in their exact ratings. For example, one expert might rate an answer a 4 while the other gave it a 5. This suggests that even among specialists, there can be subtle differences in evaluating the completeness of an answer.
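To make "fair" agreement concrete, here is a minimal sketch that assumes agreement is measured with Cohen's kappa and labeled with the commonly cited Landis and Koch bands. The article does not say which statistic the study used, and the ratings below are invented for illustration; they are not the study's data.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters scoring the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(ratings_a)
    # Share of items on which the two raters gave the identical score.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each rater's own score frequencies.
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical score throughout
    return (observed - expected) / (1 - expected)


def landis_koch_label(kappa):
    """Map kappa to the conventional Landis and Koch description."""
    if kappa < 0:
        return "poor"
    for upper, label in ((0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")):
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical scores for ten responses (NOT taken from the study).
expert_1 = [5, 5, 4, 5, 4, 5, 5, 4, 5, 5]
expert_2 = [5, 4, 4, 5, 5, 5, 5, 4, 5, 4]

kappa = cohen_kappa(expert_1, expert_2)
print(round(kappa, 2), landis_koch_label(kappa))
```

In this made-up example the experts never differ by more than one point and match on 7 of 10 responses, yet kappa still lands in the "fair" band. When nearly every score is a 4 or a 5, chance agreement is high, so a chance-corrected statistic can look modest even though both raters consistently approve of the answers.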

More interestingly, one expert scored ChatGPT's three answers to the same question very consistently, while the other expert's scores varied considerably. This suggests two things: human evaluation has a subjective element, and ChatGPT itself can give slightly different answers to the same prompt, which can influence how an expert rates them.
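One simple way to picture this within-expert consistency is to look at how far a single expert's scores spread across the three answers to each question. The sketch below uses the score range (highest minus lowest) as an assumed stand-in; the article does not name the statistic the researchers used, and the scores are invented for illustration.

```python
from statistics import mean


def per_question_range(scores_by_question):
    """Spread (max - min) of one expert's scores across sessions, per question."""
    return {q: max(scores) - min(scores) for q, scores in scores_by_question.items()}


def mean_range(scores_by_question):
    """Average spread across questions; 0 means perfectly consistent scoring."""
    return mean(per_question_range(scores_by_question).values())


# Hypothetical scores: question number -> scores for sessions 1, 2, 3.
consistent_expert = {1: (5, 5, 5), 2: (4, 4, 4), 3: (5, 5, 4)}
variable_expert = {1: (5, 3, 4), 2: (4, 5, 3), 3: (5, 4, 4)}

print(per_question_range(consistent_expert))
print(mean_range(consistent_expert), mean_range(variable_expert))
```

A spread of this kind cannot by itself separate the two sources of variation mentioned above, since a change in score may reflect a genuinely different answer from ChatGPT or a shift in the expert's judgment.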

What This Means for Celiac Disease Patients

The study's findings are encouraging and align with other research showing that ChatGPT can be a powerful tool in various medical fields. For individuals with Celiac Disease, ChatGPT-3.5 appears to be a valuable and generally accurate source of information, especially for practical advice on managing a gluten-free lifestyle.

However, the inconsistencies are a crucial reminder that AI is a supplement, not a substitute, for professional medical advice. The slight variations in answers and the subjective nature of expert evaluation mean that patients should use these tools as a starting point for their own research and for discussions with their healthcare providers.

Limitations and Future Directions

The researchers acknowledged some limitations. The study used only 20 questions, which might not cover all patient concerns. Additionally, the questions were generated by ChatGPT itself, which could have introduced a bias. The evaluation was also limited to two experts, and a dietitian's perspective could have added more nuance to the assessment of dietary advice.

Despite these limitations, the study provides a useful baseline. As AI technology evolves with newer models such as GPT-4, ongoing research will be essential to track improvements in accuracy and reliability. The ultimate goal is to safely integrate these AI tools into healthcare to improve patient education and support.

The Final Verdict

In this study, ChatGPT-3.5 showed real potential as an information resource for Celiac Disease patients. Its answers were largely accurate and comprehensive, particularly on how to manage the disease day-to-day. Users should be aware that its responses to the same question can vary, but it stands as a promising tool to supplement information from doctors and dietitians, helping patients better understand and manage their condition.
