Fake Citations in Medical Papers Up 12-Fold Since 2023

Manaal Khan · 26 May 2026 at 6:42 pm · 5 min read

Key Takeaways

Fabricated references in biomedical papers rose from 4 per 10,000 papers in 2023 to 56.9 per 10,000 by early 2026
The fake citations look real: correct formatting, real author names, plausible publication years
Researchers call for automated reference checks before publication and retroactive screening of existing papers

The Scale of the Problem

Researchers at Columbia University and other institutions have published the largest-ever audit of citations in biomedical papers. The study, published in The Lancet, examined 2.47 million papers from the open PubMed Central archive published between January 2023 and February 2026.

The team, led by Maxim Topaz, checked 97.1 million references against four major databases: PubMed, Crossref, OpenAlex, and Google Scholar. A reference counted as fabricated if its listed title could not be found in any of these sources.
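The classification rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's actual pipeline: the in-memory title sets are hypothetical stand-ins for the four databases, the titles are placeholders, and the `normalize_title` helper (exact match after lowercasing and stripping punctuation) is an assumption made for the example.

```python
import re


def normalize_title(title: str) -> str:
    """Lowercase and collapse punctuation/whitespace (an assumed normalization)."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def is_fabricated(title: str, indexes: dict[str, set[str]]) -> bool:
    """Flag a reference when its title appears in none of the indexes."""
    key = normalize_title(title)
    return not any(key in titles for titles in indexes.values())


# Hypothetical stand-ins for PubMed, Crossref, OpenAlex and Google Scholar.
# A real check would query each service instead of a local set.
indexes = {
    "pubmed": {normalize_title("Placeholder Title A: An Example")},
    "crossref": {normalize_title("Placeholder Title B")},
    "openalex": set(),
    "google_scholar": set(),
}

references = [
    "placeholder title A - an example",   # found in one index after normalization
    "A title that no index contains",     # found nowhere, so it is flagged
]

flagged = [ref for ref in references if is_fabricated(ref, indexes)]
print(flagged)
```

A reference is cleared as soon as a single index contains its title, which mirrors the "not found in any of these sources" criterion. How strictly titles are compared is a design choice that changes how many references get flagged.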

The results: 4,046 fabricated references spread across 2,810 papers. By early 2026, roughly 1 in 277 papers indexed in PubMed Central contained at least one non-existent reference.
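For a sense of proportion, the headline counts can be converted into rates with simple arithmetic. The sketch below uses only the figures quoted above. The per-10,000 scaling and the rounding are choices made here, not numbers reported by the study, and the whole-period values are naturally lower than the early-2026 values because the rate rose over time.

```python
papers_examined = 2_470_000       # papers in the audit
references_checked = 97_100_000   # references checked
fabricated_refs = 4_046           # references flagged as fabricated
affected_papers = 2_810           # papers with at least one flagged reference

PER_10K = 10_000

# Affected papers per 10,000 papers across the whole study period.
affected_per_10k = affected_papers / papers_examined * PER_10K

# Average number of fabricated references in an affected paper.
refs_per_affected_paper = fabricated_refs / affected_papers

# Fabricated references per 10,000 references checked.
fabricated_per_10k_refs = fabricated_refs / references_checked * PER_10K

# "1 in 277 papers" (early 2026) expressed per 10,000 papers.
early_2026_per_10k = PER_10K / 277

print(f"{affected_per_10k:.1f} affected papers per 10,000 (whole period)")
print(f"{refs_per_affected_paper:.2f} fabricated references per affected paper")
print(f"{fabricated_per_10k_refs:.2f} fabricated per 10,000 references checked")
print(f"{early_2026_per_10k:.1f} affected papers per 10,000 (early 2026)")
```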

A Timeline That Points to ChatGPT

The timeline tells the story. Throughout 2023, the rate held steady at about four fabricated references per 10,000 papers. Starting in mid-2024, it climbed fast. It hit 51.3 per 10,000 by the end of 2025 and reached 56.9 per 10,000 in the first seven weeks of 2026.

The researchers see an obvious connection to language models like ChatGPT, which took off in late 2022. Since papers typically take 100 to 200 days from submission to publication, AI-generated text would not show up in PubMed Central in large numbers until mid-2024. The timing matches almost exactly.

Chart: Hallucinated references in the examined papers rose rapidly from mid-2024. Source: The Lancet

The authors do not rule out other causes, including increased paper-mill activity or changes in indexing practices. But the correlation with AI adoption is hard to ignore.

Why These Fake References Fool Reviewers

The real danger is that these fake references are hard to spot. They match the paper's topic. They follow correct formatting. They credit real researchers. They carry plausible publication years. Everything looks legitimate until you try to find the actual source.

In one urology paper flagged by the study, 18 of 30 checked references were fabricated. All of them closely matched the narrow surgical subject matter. A peer reviewer reading the paper would have no obvious reason to suspect anything was wrong.

“A medical professional... has no way of knowing that the evidence they are relying on does not exist.”

— Maxim Topaz, PhD, Associate Professor at Columbia University School of Nursing and Data Science Institute

The study also found that 85% of hallucinated references in preprints survived peer review and appeared in the final published version. The system is not catching these errors.

The Clinical Guideline Risk

The stakes are higher than academic embarrassment. Review articles, which synthesize findings from multiple studies, often shape clinical guidelines. Doctors use these guidelines to make treatment decisions. If a review article cites fabricated studies, those phantom references can influence real patient care.

A physician reading a systematic review has no practical way to verify every citation. The assumption is that peer review caught obvious problems. That assumption is breaking down.

Evidence of Paper Mills

The researchers found patterns pointing to coordinated activity beyond individual AI misuse. Two authors appeared in eleven papers from the same surgical journal, with a total of 15 fabricated references on topics like CRISPR diagnostics and the gut microbiome.

This suggests paper mills, which produce fraudulent research for profit, may be using AI tools at scale. The combination of AI-generated text and fabricated citations creates a supply chain for fake science.

What the Researchers Recommend

The study authors call for automated reference checks before publication. Every citation should be verified against major databases before a paper goes to print. The technology to do this exists. The question is whether publishers will implement it.
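As a rough idea of what one building block of such a check might look like, the sketch below prepares a title lookup against Crossref's public REST API and inspects a response. It is an assumption-laden illustration, not a publisher's or the study's implementation: it makes no network call, the sample response is hand-written, and the response layout (`message` → `items` → `title` as a list of strings) reflects Crossref's JSON format as understood here and should be confirmed against the current API documentation.

```python
from urllib.parse import urlencode

CROSSREF_WORKS = "https://api.crossref.org/works"


def build_query_url(title: str, rows: int = 5) -> str:
    """Build a Crossref works query for a cited title (no request is sent)."""
    params = {"query.bibliographic": title, "rows": rows}
    return f"{CROSSREF_WORKS}?{urlencode(params)}"


def title_in_response(title: str, response: dict) -> bool:
    """Return True if any returned work has the same title, ignoring case and spacing."""
    wanted = " ".join(title.lower().split())
    for item in response.get("message", {}).get("items", []):
        for candidate in item.get("title", []):
            if " ".join(candidate.lower().split()) == wanted:
                return True
    return False


# Hand-written stand-in for a parsed JSON response; titles are placeholders.
sample_response = {"message": {"items": [{"title": ["Placeholder Title A"]}]}}

print(build_query_url("Placeholder Title A"))
print(title_in_response("placeholder  title a", sample_response))    # matches
print(title_in_response("A title nobody published", sample_response))  # no match
```

A production check would repeat the lookup across several databases, tolerate small title variations, and route unmatched references to a human rather than rejecting them automatically.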

They also recommend retroactive screening of papers already published since 2023. The contamination is already in the literature. Finding and flagging affected papers would help researchers avoid citing fabricated work.

Automated reference verification before publication
Retroactive screening of papers published since 2023
Clear sanctions for authors who submit AI-hallucinated content
Better disclosure requirements for AI tool use in writing

Some platforms have already started acting: arXiv has introduced initial sanctions for AI-related errors. But the response remains patchy across the publishing ecosystem.

The Community Response

Discussion on Hacker News and Reddit has focused on the broader crisis of scientific integrity. Many users argue this confirms the need to move away from traditional peer review toward mandatory, automated reference-validation pipelines.

There is significant cynicism about whether publishers will invest in verification tools without external regulatory pressure. The economic incentives do not obviously favor expensive validation systems.

Frequently Asked Questions

What is an AI-hallucinated citation?

A citation generated by an AI language model that looks legitimate but refers to a study that does not exist. The fake reference typically has correct formatting, plausible author names, and a relevant-sounding title.

How many papers are affected by fabricated references?

By early 2026, approximately 1 in 277 papers in PubMed Central contained at least one fabricated reference. The rate increased from 4 per 10,000 papers in 2023 to 56.9 per 10,000 by early 2026.

Why are fake citations in medical papers dangerous?

Medical professionals rely on cited research to make treatment decisions. If a clinical guideline is based on review articles containing fabricated references, doctors may follow recommendations that have no real evidence behind them.

How can researchers detect hallucinated citations?

By checking each cited work against databases like PubMed, Crossref, OpenAlex, and Google Scholar. If the exact title cannot be found in any major database, the reference is likely fabricated.

What is being done to address AI-generated fake citations?

Researchers are calling for automated reference verification before publication and retroactive screening of existing papers. Some platforms, such as arXiv, have introduced sanctions for AI-related errors, but implementation varies widely across publishers.

Source: The Decoder / Maximilian Schreiner
