Key Takeaways

- Multiple scientific papers have been retracted in 2024-2026 for publishing AI-generated figures with biologically impossible structures
- The tools for creating synthetic scientific imagery far outpace current detection capabilities
- AI product teams need to consider provenance and verification features as core requirements, not afterthoughts
AI-generated images have begun appearing in peer-reviewed scientific papers, and some have slipped past editorial review until after publication. The New England Journal of Medicine retracted a paper in April 2026 after discovering a clinical image had been manipulated with AI. Two papers were pulled in 2024 for publishing AI-generated figures showing biologically impossible structures. These public cases represent what researchers call the tip of the iceberg.
The problem extends beyond obvious fakes. A photograph of Earth from the Artemis II mission caught public attention in April 2026, reminiscent of Apollo 8's iconic Earthrise image. But the visual similarity between authentic NASA photography and what any person can generate from a text prompt in seconds raises a question that now applies to scientific evidence broadly: how do people decide which image is real?
Where AI images show up in science
Researchers already use AI tools across the visual production pipeline. They generate illustrations for papers, create synthetic data for training models, edit lab images, and produce materials for education and public outreach. The same capabilities that help scientists communicate complicated ideas more efficiently also blur the lines between illustration, enhancement, and fabrication.
Elisabeth Bik, a science integrity consultant who has detected image manipulation in thousands of papers, estimates that roughly 1 in 25 biomedical papers contains problematic images. Her work predates the current generation of AI image tools. The detection challenge has only grown harder.
Fields that depend heavily on visual evidence face the sharpest risks. Materials science researchers have warned that AI-generated visuals pose growing threats to their discipline, where microscopy images and structural analyses form the evidentiary backbone of papers.
Why detection keeps falling behind
The asymmetry between creation and detection tools creates a structural problem. Generating a plausible scientific image takes seconds and costs nearly nothing. Verifying that an image accurately represents what it claims to show requires expertise, time, and often access to original data that may not be publicly available.
Hany Farid, a UC Berkeley professor who specializes in digital forensics, has described how AI fundamentally changes the old assumption that seeing is believing. The scientific image has always been a constructed representation: microscopes, telescopes, and medical imaging devices all involve choices about what to capture and how to render it. But AI introduces a new category, images that look like evidence but represent nothing real.
This erosion of visual credibility feeds into broader trust problems. Pew Research found that 40% of Americans now report low trust in scientists, a share that has climbed from historical lows. AI-generated imagery in public spaces does not cause this distrust on its own, but it accelerates it by making authentic evidence harder to distinguish from fabrication.
What journals and institutions are doing
Some journals now require authors to disclose AI use in image creation or editing. Yet the New England Journal of Medicine retraction suggests that even high-prestige publications with rigorous review processes can miss manipulated images. Post-publication detection currently relies heavily on reader reports and the work of independent integrity consultants like Bik.
Automated detection tools exist but have not kept pace with generation quality. Most solutions involve comparing submitted images against databases of known manipulated images or looking for statistical anomalies in pixel data. Newer generative models produce outputs that evade these checks.
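The database-comparison approach can be illustrated with a minimal sketch. It assumes each image has already been decoded and downsampled to a small grayscale grid, and it uses an average hash, one simple form of perceptual fingerprint, as a stand-in for whatever a real detection tool does. The names average_hash, hamming_distance, and matches_known_image, along with the max_distance threshold, are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence

# A tiny grayscale image: rows of pixel values in the range 0-255.
Grid = Sequence[Sequence[int]]


def average_hash(pixels: Grid) -> int:
    """Pack one bit per pixel: 1 if the pixel is brighter than the grid mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions where two hashes differ."""
    return bin(a ^ b).count("1")


def matches_known_image(pixels: Grid, known_hashes: List[int],
                        max_distance: int = 1) -> bool:
    """Return True if the image is within max_distance bits of any known hash."""
    candidate = average_hash(pixels)
    return any(hamming_distance(candidate, known) <= max_distance
               for known in known_hashes)


# Hypothetical database holding the hash of one previously flagged image.
flagged = [[10, 200], [220, 30]]
known_hashes = [average_hash(flagged)]

# A slightly brightened copy of the flagged image still matches.
brightened = [[12, 205], [225, 33]]
is_reuse = matches_known_image(brightened, known_hashes)

# An image with a different light/dark layout does not match.
novel = [[200, 10], [30, 220]]
is_novel_flagged = matches_known_image(novel, known_hashes)
```

A lookup like this can only flag images that resemble something already catalogued. A freshly generated figure has no near-duplicate in the database, which may be one reason such checks struggle against newer models.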
The research community has begun discussing provenance standards that would attach cryptographic signatures to images at capture time, similar to C2PA (Coalition for Content Provenance and Authenticity) standards being developed for news photography. Adoption remains limited.
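The idea of signing an image at capture time can be sketched in a few lines. This is an illustration under stated assumptions, not the C2PA format: C2PA relies on certificate-based public-key signatures, whereas the sketch below uses an HMAC with a shared secret purely so that it runs on the Python standard library. The function names sign_at_capture and verify_provenance, the manifest layout, and the metadata fields are all hypothetical.

```python
import hashlib
import hmac
import json


def sign_at_capture(image_bytes: bytes, device_key: bytes, metadata: dict) -> dict:
    """Bind the image hash and capture metadata together under one signature."""
    record = {
        "sha256": hashlib.sha256(image_bytes).hexdigest(),
        "metadata": metadata,
    }
    # Serialize with sorted keys so signing and verifying see identical bytes.
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    signature = hmac.new(device_key, payload, hashlib.sha256).hexdigest()
    return {"record": record, "signature": signature}


def verify_provenance(image_bytes: bytes, device_key: bytes, manifest: dict) -> bool:
    """Check that the record is untampered and still matches the image bytes."""
    record = manifest["record"]
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    expected = hmac.new(device_key, payload, hashlib.sha256).hexdigest()
    signature_ok = hmac.compare_digest(expected, manifest["signature"])
    actual_hash = hashlib.sha256(image_bytes).hexdigest()
    content_ok = hmac.compare_digest(actual_hash, record["sha256"])
    return signature_ok and content_ok


# Hypothetical capture on an instrument that holds a secret key.
device_key = b"example-instrument-key"
original = b"raw pixel data from the instrument"
manifest = sign_at_capture(original, device_key, {"instrument": "example-microscope"})

untouched_ok = verify_provenance(original, device_key, manifest)
edited_ok = verify_provenance(original + b" edited", device_key, manifest)
```

Because the signature covers both the image hash and the metadata, changing either one after capture causes verification to fail. A production system would also need key management and a way to record legitimate edits, which this sketch leaves out.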
What this means for AI product teams
Teams building AI tools face a choice about how much responsibility to take for downstream misuse. Image generation platforms like Midjourney, DALL-E, and Stable Diffusion have implemented various content policies, but none specifically address scientific imagery as a category. A generated image of a protein structure or a microscopy result falls outside typical content moderation focused on violence, explicit content, or celebrity deepfakes.
The verification side presents a product opportunity. Startups and research labs are working on detection tools, but no market leader has emerged. The technical challenge remains formidable: as generation improves, detection must improve at least as fast just to keep pace.
Logicity's Take
For AI product teams, the scientific imagery problem previews a broader challenge: when your tool can produce outputs indistinguishable from evidence, you inherit some responsibility for how users distinguish the two. The winning approach will likely involve provenance features baked into generation, not detection bolted on afterward. Teams building image generation tools should watch C2PA adoption closely. Content credentials may become table stakes for enterprise and scientific use cases within two to three years. Companies like Adobe have already integrated C2PA into Creative Cloud; generative AI tools that lack similar features may face access restrictions from institutional buyers.
The shift from 'real unless proven fake' to 'suspect until verified'
Elisabeth Bik has suggested that the default assumption may need to flip. Instead of treating images as real unless proven fake, the scientific community may need to treat images as suspect unless verified. This shift would require infrastructure that does not yet exist at scale: standardized provenance metadata, verification tools integrated into journal submission systems, and training for reviewers.
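In software terms, flipping the default could look something like the sketch below, in which every submitted figure starts as suspect and is upgraded only when checks pass. The FigureSubmission fields, the two checks, and the rule that both must pass are assumptions made for illustration; they do not describe any journal's actual submission system.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, List


class FigureStatus(Enum):
    SUSPECT = "suspect"
    VERIFIED = "verified"


@dataclass(frozen=True)
class FigureSubmission:
    figure_id: str
    # Both flags default to False, so an unchecked figure stays suspect.
    provenance_valid: bool = False   # e.g. a signed capture record checked out
    raw_data_supplied: bool = False  # original instrument data sent to the journal


def triage(figure: FigureSubmission) -> FigureStatus:
    """Start from SUSPECT and upgrade only when every check passes."""
    if figure.provenance_valid and figure.raw_data_supplied:
        return FigureStatus.VERIFIED
    return FigureStatus.SUSPECT


def figures_needing_review(figures: Iterable[FigureSubmission]) -> List[str]:
    """List the IDs of figures a reviewer still has to examine by hand."""
    return [f.figure_id for f in figures if triage(f) is FigureStatus.SUSPECT]


submission = [
    FigureSubmission("fig1", provenance_valid=True, raw_data_supplied=True),
    FigureSubmission("fig2", provenance_valid=True),
    FigureSubmission("fig3"),
]
review_queue = figures_needing_review(submission)
```

The design choice worth noting is the default: a figure with no information attached lands in the review queue instead of passing silently, which is where the added documentation burden on authors comes from.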
The transition will be painful for legitimate researchers who now face additional documentation burdens. It will also create new attack surfaces, as provenance systems themselves become targets for manipulation.
What remains unclear is whether verification infrastructure can scale fast enough to preserve the evidentiary role of images in science. The alternative, a research culture where visual evidence carries little weight, would represent a significant step backward for fields that depend on it.
Frequently Asked Questions
How many scientific papers have been retracted for AI-generated images?
Confirmed retractions remain in the dozens, but researchers estimate the actual number of problematic images is far higher. Elisabeth Bik's analysis suggests roughly 1 in 25 biomedical papers contains problematic images, though not all are AI-generated.
Can AI-generated scientific images be detected?
Detection tools exist but lag behind generation capabilities. Most rely on statistical analysis of pixel data or comparison against databases of known manipulations. Newer generative models increasingly evade these checks.
What journals require AI disclosure for images?
Many major journals now require authors to disclose AI use in figure creation or editing. However, enforcement relies heavily on author honesty, and post-publication detection depends on reader reports and independent integrity consultants.
What is C2PA and how does it relate to scientific images?
C2PA (Coalition for Content Provenance and Authenticity) is a standard for attaching cryptographic provenance data to images at capture time. It is being developed primarily for news photography but could apply to scientific imaging. Adoption in research contexts remains limited.
Source: Fast Company / The Conversation
Huma Shazia
Senior AI & Tech Writer
Produced with AI assistance and reviewed by the Logicity editorial team. Learn more in our Editorial Policy.