Biomedical visualisation specialists are still unsure about how to use generative AI when creating images for science and health applications. There is an urgent need for guidelines and best practices, as incorrect illustrations of anatomy or related subject matter can cause harm in clinical settings and fuel online misinformation. Researchers from the University of Bergen, Norway, the University of Toronto, Canada, and Harvard University, US, make this point in a paper entitled “‘It looks sexy but it’s wrong.’ Tensions in creativity and accuracy using GenAI for biomedical visualization,” which will be presented at IEEE VIS 2025 in November.
The authors of the paper – Roxanne Ziman and Laura Garrison (University of Bergen), Shehryar Sharan (University of Toronto), and Gael McGill (Harvard University) – present various illustrations created using OpenAI’s GPT-4o or DALL-E 3 alongside versions created by visualization experts. Screenshots from the paper.
Top row: inaccurate images from GPT-4o and DALL-E 3. Bottom row: images created by BioMedVis illustrators.
Several of the examples cited are not accurate, and some of the AI-generated health-related images could be harmful. As Ziman put it in an interview: “The issue with AI-generated images is that they are used in scientific research and medical publications. While the potential harm isn’t immediately apparent, the increased use of inaccurate images like this to communicate health and medical information, and problems like reinforcing stereotypes in healthcare, is troubling.”
Ziman said that the larger problem, echoed in a series of interviews discussed in the paper, is the way inaccurate imagery affects how the public sees scientific research. She cited the AI-generated “well-endowed rat” figure – retracted from a Frontiers journal after widespread ridicule – as an example of how inaccurate imagery has affected public perceptions. Ziman said:
“Satirical criticism by such public figures (that people may tend to trust more than ‘legitimate’ news sources) can throw into question the legitimacy of the scientific research community at large, and the public can come to distrust (even more) or not take seriously what they hear coming out of the scientific research community.”
“Think of the consequences then for public health communications as during COVID, vaccine campaigns, etc. And bad actors now have greater ease of quickly creating and sharing misleading but convincing-looking imagery.”
Ziman stated that while AI-generated images are often shared in the BioMedVis community to be laughed at and criticised, practitioners have not yet figured out how to mitigate risks.
To that end, they surveyed 17 BioMedVis experts to assess their views and how they use generative AI in their work. The survey respondents, who are referred to in the paper by pseudonyms, expressed a wide variety of views on generative AI. The authors grouped the survey respondents into four personas – Enthusiastic Adopters, Curious Adapters, Cautious Optimists, and Skeptical Avoiders.
Some respondents appreciated the abstract, otherworldly aesthetics generated by AI models, saying the images helped advance conversations with clients. Others (about half of the respondents) were critical, agreeing with “Frank” that the generic look is boring.
The survey-takers also used text-to-text models for captions, though not always to the satisfaction of respondents. The paper notes that “Irrelevant or hallucinated references remain a problem, as do invented new terms, such as the ‘green glowing protein.'”
Some survey respondents view generative AI as a useful tool for rote coding tasks, such as generating boilerplate code or cleaning data. Others believe they’ve already spent the time learning to program and prefer to use their skills rather than delegate.
The researchers also note a contradictory stance among respondents who “express grave concerns about intellectual property violations that are, for the moment, baked into public GenAI tools” when generative AI is used for commercial purposes, while accepting its use at a personal level.
Despite the fact that 13 of 17 respondents already use GenAI in their production workflows, BioMedVis designers and developers still prioritize accuracy when creating images, and “GenAI in its current state is unable to achieve this benchmark,” the authors observe. They cite “Arthur” on this point: “While it’s still scraping the digital world for references it can use to generate art, it’s not yet able to know the difference between the sciatic nerve and the ulnar nerve. It’s just, you know, wires.”
“Ursula” also mentions GenAI’s inability to produce accurate anatomy: “Show me an organ, and MidJourney will say, Here is your pile of alien eggs!”
The respondents also raise more general concerns regarding the black-box nature of these models and the difficulty of addressing bias. The paper explains that “Inaccurate or unreliable outputs, whether the anatomical visuals …or blocks of code, can mislead and diffuse responsibility. Participants questioned who should be held accountable in instances where GenAI is used and lines of accountability blur.” The black-box nature of the models prevents this sort of accountability. As one survey respondent, “Kim,” stated: “There should be someone who can explain the results. It is about trust, and […] about competence.”
In an email, co-author Shehryar Sharan, from the University of Toronto, told The Register that he hoped this research would encourage people to think critically about the role of generative AI in the work and values of BioMedVis practitioners. Sharan said:
“These tools are becoming a bigger part of our field, and it’s important that we don’t just use them, but critically reflect on what they mean for how we work and why we do what we do.”
“As a community, we should feel comfortable sharing our thoughts, questions, and concerns about these tools. Without open conversation and a willingness to reflect, we risk falling behind or using these technologies in ways that don’t align with what we actually care about. It’s about making space to think and reflect before we move forward.” ®
