Tuesday, April 14, 2026 | 12:48 PM IST
Business Standard

Relying on AI for health advice? Study finds 80% of early diagnoses go wrong

New research highlights serious gaps in AI-led diagnosis, with chatbots missing most early cases but performing better as clinical details increase


New findings raise concerns about the reliability of AI tools in assessing symptoms during the early stages of illness. (Photo: Adobestock)

Sarjna Rai New Delhi


  Artificial intelligence (AI) may be fast becoming a go-to for quick health advice, yet a new study has raised serious concerns about its reliability. While many users turn to chatbots for early symptom checks, researchers now say that these tools frequently get it wrong, especially in the crucial early stages of diagnosis, and that gap could directly affect patient outcomes.
 
The study, titled “Large Language Model Performance and Clinical Reasoning Tasks”, published in JAMA Network Open, found that AI chatbots misdiagnosed medical conditions in over 80 per cent of early clinical cases. The findings highlight a growing disconnect between AI’s promise and its real-world clinical performance.
 
 

What the study found

 
The research assessed 21 large language models across 29 clinical case scenarios, generating a total of 16,254 diagnostic responses. The models tested included GPT-4, Gemini, Grok and Claude, among others, from developers such as OpenAI, Google, Meta, xAI and Anthropic.
 
Unlike structured tests, the models were evaluated on open-ended diagnostic reasoning, which better reflects real-world use.
 
  • AI systems failed to provide the correct diagnosis in over 80 per cent of early-stage cases, where symptoms are often non-specific
  • Even when the correct answer appeared, it was often not ranked as the most likely diagnosis, reducing its practical value
  • Failure rates fell below 40 per cent once models were given more complete case data to reach a final diagnosis, with the best-performing models exceeding 90 per cent accuracy at that stage
  • The study reinforced that AI tends to generate multiple possibilities, but struggles to prioritise the most accurate one
 
This is particularly concerning because early-stage diagnosis is often when timely intervention matters most.
 
“These models are great at naming a final diagnosis once the data is complete, but they struggle at the open-ended start of a case, when there isn’t much information,” said the study’s lead author, Arya Rao, who is also a researcher at the Massachusetts-based Mass General Brigham healthcare system.
 

Why AI struggles with early diagnosis

 
Experts say the issue lies in how these systems generate responses: unlike doctors, AI does not reason clinically but predicts patterns from vast datasets. As a result, it may produce answers that sound convincing but are not medically sound, and it often struggles when symptoms are vague or evolving.
 
This limitation becomes more pronounced in early-stage illness, where human doctors rely on intuition, experience and patient context, while AI cannot ask meaningful follow-up questions or reassess its conclusions dynamically.
 
The findings underline a consistent pattern seen in earlier research, where human doctors outperform AI in diagnostic accuracy, especially in complex or uncertain cases. While AI can assist with information retrieval, it still lacks the layered thinking required in medicine and therefore cannot replicate clinical judgment.
 

The risks for patients

 
The implications extend beyond research settings, because increasing reliance on AI for self-diagnosis could lead to real-world harm.
 
  • Misdiagnosis in early stages may lead to delayed treatment and disease progression
  • Incorrect suggestions can trigger unnecessary anxiety or false reassurance
  • Overestimation of serious conditions may result in avoidable tests and consultations
  • Confident but incorrect responses can make users trust flawed advice without verification
 
As more people turn to AI for health queries, these risks become harder to ignore.
 

Are chatbots still useful?

 
Despite these concerns, researchers say AI can still be useful if used carefully. Chatbots can support patients by offering general awareness and helping them structure conversations with doctors, but they should not be relied upon for diagnosis or treatment decisions.
 
While the technology is evolving, it still falls short of replacing human expertise, and until that changes, medical advice should always be confirmed by a qualified professional.
 
   
For more health updates, follow #HealthwithBS
This report is for informational purposes only and is not a substitute for professional medical advice.
 


First Published: Apr 14 2026 | 12:40 PM IST
