Dr. ChatGPT will see you now

A Reddit poster lived with a painful, clicking jaw for five years after a boxing injury. They had seen specialists and received MRIs, but no one could offer a solution. When they described the problem to ChatGPT, the AI chatbot suggested that a jaw-alignment issue might be the cause and offered a tongue-placement technique as a possible fix. When they tried it, the clicking stopped. After five years of living with the problem, they wrote on Reddit in April, "this AI gave me a fix in a minute." The story went viral, with LinkedIn cofounder Reid Hoffman sharing it on X. And it's not a one-off: stories of patients allegedly getting accurate assessments from LLMs, sometimes after sharing MRI scans or X-rays, are flooding social media.

Courtney Hofmann's son has a neurological condition that went undiagnosed across 17 doctor visits over three years. ChatGPT gave her an answer – tethered-cord syndrome, in which the spinal cord cannot move freely because of surrounding tissue – that she says the doctors treating her son had missed. Six weeks after she consulted ChatGPT, her son underwent surgery. "He is a brand new kid," she told a New England Journal of Medicine podcast in November 2024.

Consumer-friendly AI tools are changing how people seek medical advice about symptoms and diagnoses. The age of "Dr. ChatGPT" has arrived, and medical schools, physicians, and patient groups are racing to catch up. They are trying to determine how accurate these LLMs' medical answers are, how patients and doctors can best use them, and what to do when they give incorrect information. Adam Rodman, a Harvard Medical School professor and practicing physician, says he is confident the chatbots will improve health care. "You can imagine many ways that people could communicate with LLMs, which might be connected to their medical records," he says.

Rodman has already seen patients use AI chatbots during his own hospital rounds. On a recent shift, while he was juggling the care of more than a dozen patients, one woman, frustrated by a long wait, took screenshots of her medical records and fed them into an AI chatbot. "I already asked ChatGPT," she told Rodman – and, he says, it gave her an accurate answer about her condition, a blood disorder.

Rodman was not put off by the conversation. He believes that AI can improve interactions between doctors and patients by providing better information. “I see this as an opportunity to talk with the patient and find out what their concerns are,” he says.

Potential is the key word. Under the right circumstances, AI can provide accurate medical advice and diagnoses. But once these tools are in people's hands – whether doctors' or patients' – accuracy is often compromised. Users make mistakes, such as failing to give the AI all of their symptoms or disregarding correct information when it is offered.

In one study, for example, researchers asked physicians to estimate the likelihood that patients had various diseases based on their symptoms and histories, and again after seeing lab results. One group had AI assistance; the other did not. Both groups scored similarly on a metric that captures not only diagnostic accuracy but also how well they explained their reasoning and considered alternatives: the AI-assisted physicians had a median diagnostic-reasoning score of 76 percent, versus 74 percent for the group using only standard resources. When the AI was tested on its own, without any human input, it scored far higher, with a median of 92 percent.

Harvard's Rodman was involved in this study. He notes that when the research was conducted in 2023, AI chatbots were still relatively new, and doctors' unfamiliarity with the tools may have hurt their ability to reach an accurate diagnosis. Beyond that, the broader insight was that doctors still saw themselves as the primary filter of information. Physicians loved it when the machine agreed with them, he says, and ignored it when it disagreed. "They didn't believe it when the machine said that they were wrong."

Rodman also tested AI on a difficult case from a few years back that he and several other specialists had initially misdiagnosed. He gave the tool the information he had about the patient, and "the first thing it spat back was the very rare condition that this patient had," he says. The AI also suggested a more common diagnosis, but flagged it as less likely – the same diagnosis Rodman and his team had initially, and wrongly, settled on.

Another preprint study, of more than 1,200 participants, found that AI provided the correct diagnosis almost 95 percent of the time on its own, but only about a third of the time when people used the tools to guide their thinking.

One scenario in the study involved a sudden-onset headache with a stiff neck – symptoms that call for immediate medical attention because they can signal meningitis or a brain hemorrhage. Some users got the right answer from the AI. Others were told to take over-the-counter pain medication and lie down in a darkened room. The AI tended to generate incorrect answers, the study found, when users failed to mention that the symptoms had come on suddenly.

But right or wrong, AI presents its responses confidently, as if they were true – even when they are completely false. That, says Alan Forster, a physician and professor of innovation at McGill University, is a big problem. Unlike an internet search, which returns a list of links and websites to check out, AI chatbots answer in prose, and Forster says that well-structured text makes the chatbot's message appear more authoritative. "It is very well constructed and it feels more real."

Even when the AI is correct, it can't replace the knowledge physicians gain from experience, says Jaime Knopman, a fertility doctor at a midtown Manhattan clinic. Patients bring her information from AI bots that isn't necessarily wrong, she says, but the approach the LLM suggests may not be the right one for that patient's case.

Couples considering IVF, for example, receive grades for their embryos. Relying on those scores alone to recommend next steps, Knopman says, ignores other important factors: when the embryo was biopsied, the condition of the uterine lining, and whether the patient has had fertility success in the past. Knopman, who has a medical degree and years of training, has also cared for thousands of women, which she says gives her real-world insight into what to do next that an LLM lacks. Other patients come to her with a firm idea of how they want an embryo transfer done based on the response they received from AI. The method the AI suggests may be common, she says, but it isn't necessarily the best option for the patient's specific circumstances. There is the science, which doctors study and learn, she says, but also the art of knowing why a certain treatment protocol or modality is better for a particular patient.

Some of the companies behind these AI bots have developed tools to address concerns about the medical information they provide. On May 12, OpenAI, the maker of ChatGPT, announced HealthBench, a system designed to measure how well AI models respond to health questions. OpenAI says the benchmark was developed with the help of more than 260 doctors across 60 countries and includes 5,000 simulated conversations between users and AI models, graded with a scoring system designed by physicians.

The company says doctors were able to improve the responses generated by earlier versions of its models, but that the latest models available as of April 2025, such as GPT-4.1, produce answers as good as or better than those written by human doctors. "Our findings show that large language models have improved significantly over time and already outperform experts in writing responses to examples tested in our benchmark," OpenAI's website states. "Yet even the most advanced systems still have substantial room for improvement, particularly in seeking necessary context for underspecified queries and in worst-case reliability."

Microsoft, meanwhile, claims to have created an AI-based system called MAI Diagnostic Orchestrator that, in tests, diagnosed patients four times more accurately than human doctors. The system works by querying a number of large language models, including OpenAI's GPT, Google's Gemini, Meta's Llama, and xAI's Grok.

New doctors will have to learn both how to use AI tools and how to counsel patients who use them, says Bernard S. Chang, dean for medical education at Harvard Medical School. His school was the first to offer students classes on how to use the technology in their practice, something Chang calls "one of the most exciting developments in medical education right now."

Chang says the situation reminds him of 20 years ago, when people first began turning to the internet for medical information. Patients would tell him, "I hope that you're one of those doctors who uses Google." As the search engine became more established, he wanted them to know that you wouldn't go to a physician who didn't use it. "What kind of doctor practices at the forefront of medical science and doesn't utilize this powerful tool?"

