ChatGPT struggles with echocardiography, but still shows potential to help cardiology trainees

ChatGPT-4, the latest version of OpenAI's massively popular dialogue-based artificial intelligence (AI) model, delivered a "low performance" when tasked with answering questions about echocardiography, according to a new research letter published in JACC: Cardiovascular Imaging.[1]

“ChatGPT possesses the capacity to swiftly generate curated content, which lends to its utilization as a writing tool and, in some cases, as an author,” wrote co-author Arun Umesh Mahtani, MD, with the department of internal medicine at Richmond University Medical Center, and colleagues. “Currently, JACC journals prohibit the use of large language models (LLMs) and other AI methods for writing and authorship while requiring disclosure and responsibility for data integrity, if utilized. At the same time, the use of LLMs in publishing is expected to evolve.”

Mahtani et al. asked ChatGPT to answer 150 questions one might find on a board certification exam and to explain its answers in a way that would demonstrate a full understanding of the material covered in a popular textbook, Clinical Echocardiography Review: A Self-Assessment Tool. The questions included open-ended (OE) questions, multiple choice without forced justification (MC-NJ) questions and multiple choice with forced justification (MC-J) questions. While some of the 150 questions were clinical vignettes, others were fact-based or formula-based questions.

Overall, ChatGPT provided acceptable answers to 47.3% of OE questions, 53.3% of MC-NJ questions and 55.3% of MC-J questions. Among the OE questions it answered correctly, 81.6% were fact-based questions, 12.6% were clinical vignettes and 5.6% were formula-based questions. A similar breakdown was seen in MC-NJ and MC-J questions.

“ChatGPT demonstrated the most accuracy in answering questions derived in MC-J format,” the authors wrote. “Fact-based questions had the highest percentage of correct answers in all three formats, likely because these questions are similar to natural language.”

While it is still too early to know if ChatGPT and other advanced algorithms can help trainees kickstart their careers, Mahtani and colleagues did note that ChatGPT-4 "may offer a role in generating new echocardiography board–type questions for trainees to practice."

The group concluded its analysis by reviewing some of the limitations of its work.

“We did not compare the performance of ChatGPT with human trainees,” they wrote. “It appears reductionist to claim that ChatGPT’s ability is that of a board-certified cardiologist as passing the echocardiography boards requires clinical acumen. ChatGPT’s performance may be lower if asked to assess images such as those portrayed on echocardiography boards. Further studies are needed to address these limitations.”


Michael Walter, Managing Editor

Michael has more than 18 years of experience as a professional writer and editor. He has written at length about cardiology, radiology, artificial intelligence and other key healthcare topics.
