AI chatbots ‘not safe for medical advice’, Suffolk experts warn as study finds widespread inaccuracies

A new study has found that around half of AI chatbots' responses to health questions are inaccurate or misleading.

Professor Nicholas Caldwell
Author: Jasmine Oak | Published 20th Apr 2026

Chatbots such as ChatGPT and Grok are not safe to rely on for medical advice, experts have warned, after new research found significant levels of inaccurate and misleading information.

The study

A study published in the journal BMJ Open found that around half of responses to medical questions were “problematic”, raising concerns about the growing use of artificial intelligence in health-related decision-making.

Researchers analysed answers from five major AI chatbots across 50 medical questions, covering topics including cancer, vaccines, nutrition and chronic conditions.

The findings showed that Grok produced problematic responses in 58% of cases, followed by ChatGPT at 52% and Meta AI at 50%.

The study concluded that chatbots can “hallucinate”, generating convincing but incorrect or incomplete answers due to limitations in their training data and design.

"Think of it as a medical textbook you can talk to, not a doctor who can treat you."

Professor Nicholas Caldwell, Director of the Digital Futures Institute at the University of Suffolk, said the findings reflect a wider concern about how AI is being used by the public.

He warned that while chatbots can appear confident and authoritative, they are fundamentally not equipped to provide safe medical guidance.

“AI is not a doctor,” he said. “It doesn’t know what it’s doing in the way a trained professional does, and it shouldn’t be relied upon for medical decisions.”

He added that responses generated by AI systems are effectively based on probability rather than understanding.

“It’s mathematical pattern matching, in effect, a sophisticated form of rolling dice. You might get something reasonable, but who wants to rely on luck when it comes to their health?”

Professor Caldwell also warned that the technology can produce answers that sound plausible, even when they are incorrect, increasing the risk that users may trust inaccurate information.

"It's the same way that you shouldn't really be using Doctor Google to self-diagnose, don't be using Doctor Chatbot to self-diagnose, but it may help to learn about the thing that could be affecting you. For example, asking it what questions you could ask your GP or your pharmacist, so as you're not rocking up (to the appointment) going, oh, what do I do? What is this?

False information

The study also found that citations provided by AI tools were often incomplete or fabricated, with previous research suggesting only around a third of references generated by similar systems are fully accurate.

Experts warned that the increasing integration of AI into healthcare settings must be approached with caution, adding that such tools are not licensed to provide medical advice and may lack up-to-date clinical knowledge.

They called for stronger public education, professional training and regulatory oversight to ensure AI supports, rather than undermines, public health.

The developers of ChatGPT and Grok have been contacted for comment.
