AI Fake Citations and Hallucinated References: Why They Look Real

AI-generated citations can look perfectly formatted and credible even when the source does not exist or does not say what the AI claims.

AI citation errors are not just random typos. They happen because large language models are built to produce plausible text, not to verify evidence.

When an AI system writes a citation, it may be doing something very different from what a human reader expects. The user sees an author name, a journal title, a year, a DOI, and a confident sentence. But the model may only be continuing a pattern it has learned from millions of academic-looking examples.

That is why fake citations can feel so convincing.

A citation can look complete before it is real

A fabricated reference often has all the surface features of a real one: recognizable author names, a plausible journal, a reasonable publication year, clean formatting, and sometimes even a DOI-shaped string. The problem is that none of these features prove the source exists. They only prove that the citation looks like something that could exist.

The scale is no longer theoretical. Zhao et al. audited 111 million references across 2.5 million papers and estimated 146,932 hallucinated citations in 2025 alone. Walters and Wilder found that in short literature reviews generated by ChatGPT, 55% of GPT-3.5 citations and 18% of GPT-4 citations were fabricated.

The answer can sound right, the citation can look right, and the connection between them can still be broken.

Failure mode 1: the source does not exist

The entire citation may be invented. The paper does not exist, the DOI does not resolve, or the journal issue never contained that article.

The AI might cite "Johnson et al., 2021" in a real-sounding journal, with a plausible title and a DOI-shaped link. Everything looks academic. The paper still is not there.
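To make "DOI-shaped" concrete, here is a minimal Python sketch of a shape check. The pattern is a rough heuristic chosen for this example, not an official DOI validator, and the identifier below is invented purely for illustration. Passing this check says nothing about whether the DOI resolves.

```python
import re

# Heuristic shape of a modern DOI: "10.", a 4-9 digit prefix, a slash, a suffix.
# This pattern is an assumption made for this sketch, not an official validator.
DOI_SHAPE = re.compile(r"^10\.\d{4,9}/\S+$")


def looks_like_doi(text: str) -> bool:
    """Return True if the string is DOI-shaped. Says nothing about existence."""
    return bool(DOI_SHAPE.match(text.strip()))


# A made-up identifier, invented here for illustration only.
invented = "10.9999/example.2021.0042"
print(looks_like_doi(invented))  # True
```

The invented string passes because the check only looks at form. Existence needs a separate lookup against a resolver or bibliographic database.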

Failure mode 2: the paper exists, but the claim is wrong

The paper may exist, but the AI assigns it the wrong finding. This is harder to catch precisely because the first check passes. A reader clicks the link, sees a real paper, and relaxes. But the paper may not say what the AI claims it says.

This is the difference between source existence and source support. A real citation can still be a bad citation if it does not support the sentence attached to it.
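One way to keep the two questions apart in your own notes is to record them as separate fields. The sketch below is a hypothetical structure, and every name in it is invented for illustration. The point is that "exists" never implies "supports".

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class CitationStatus(Enum):
    NOT_FOUND = "source not found"
    EXISTS_UNREAD = "source exists, support not yet checked"
    EXISTS_UNSUPPORTED = "source exists but does not support the claim"
    SUPPORTED = "source exists and supports the claim"


@dataclass
class CitationCheck:
    exists: bool
    # Stays None until a person has actually read the source.
    supports_claim: Optional[bool] = None


def status(check: CitationCheck) -> CitationStatus:
    """Combine the two checks without letting existence stand in for support."""
    if not check.exists:
        return CitationStatus.NOT_FOUND
    if check.supports_claim is None:
        return CitationStatus.EXISTS_UNREAD
    if check.supports_claim:
        return CitationStatus.SUPPORTED
    return CitationStatus.EXISTS_UNSUPPORTED
```

A citation whose link works but whose text has not been read stays at `EXISTS_UNREAD`, which is exactly the state in which a reader tends to relax too early.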

Failure mode 3: the statistic has no source

The AI may invent a statistic without a traceable source. Numbers are persuasive because they feel precise. But precision is not evidence.

A claim such as "300% growth since 2022" can move quickly through a report or presentation because it looks measurable. If no database, paper, or official report is attached to it, the number should be treated as unverified until proven otherwise.
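A rough way to surface such numbers is to flag sentences that contain a statistic but no visible sign of a source. The Python sketch below uses simple patterns that are assumptions made for this example. It will miss cases and raise false alarms, so treat it as a prompt to look closer, not as a verdict.

```python
import re
from typing import List

# Numbers that read as statistics: percentages, "3x"-style multipliers,
# or comma-grouped figures such as 146,932.
STAT = re.compile(r"\d[\d,.]*\s?%|\b\d+(?:\.\d+)?x\b|\b\d{1,3}(?:,\d{3})+\b")

# Crude signs that a source is attached: a URL, a DOI-shaped string,
# a parenthetical ending in a year such as "(Smith, 2023)", or a "[12]" marker.
SOURCE = re.compile(
    r"https?://|\b10\.\d{4,9}/\S+|\([^()]*\b(?:19|20)\d{2}\)|\[\d+\]"
)


def unsourced_statistics(text: str) -> List[str]:
    """Return sentences that contain a statistic but no visible source marker."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s for s in sentences if STAT.search(s) and not SOURCE.search(s)]


sample = (
    "The market saw 300% growth since 2022. "
    "Adoption rose 40% (Smith, 2023). "
    "The team met on Tuesday."
)
print(unsourced_statistics(sample))
# ['The market saw 300% growth since 2022.']
```

The second sentence is not flagged only because a citation is attached. Whether that citation exists, and whether it supports the 40% figure, still has to be checked. "Smith, 2023" here is a placeholder, not a real reference.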

Failure mode 4: uncertainty becomes consensus

The AI may turn uncertainty into consensus. A mixed research field becomes "studies show," "experts agree," or "there is strong evidence," even when the underlying literature is limited or divided.

This is especially risky in medical, legal, financial, and policy contexts, where a confident summary can hide important disagreement, limitations, or missing evidence.
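The consensus phrases quoted above are easy to scan for. This short Python sketch flags them so a reader knows where to ask "which studies?" or "which experts?". The phrase list is only the three examples from this post and is far from exhaustive.

```python
import re
from typing import List

# Phrases taken from the examples in the text; illustrative, not exhaustive.
CONSENSUS_PHRASES = ["studies show", "experts agree", "there is strong evidence"]
PATTERN = re.compile(
    "|".join(re.escape(p) for p in CONSENSUS_PHRASES), re.IGNORECASE
)


def consensus_claims(text: str) -> List[str]:
    """Return each consensus-style phrase found, lowercased, in order of appearance."""
    return [m.group(0).lower() for m in PATTERN.finditer(text)]


print(consensus_claims("Studies show that X helps. Experts agree on the dose."))
# ['studies show', 'experts agree']
```

A match does not mean the claim is wrong. It only marks a sentence where the strength of the evidence has been asserted and not shown.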

What a first-pass check should ask

A citation is not verified just because it has a link. A DOI is not enough. A familiar institution is not enough.

The first-pass questions are simpler:

  • Does the cited source exist?
  • Does the metadata match the AI's citation?
  • Is the source authoritative enough for the claim?
  • Does the source actually support what the AI wrote?

The last question still requires reading. But the earlier questions can catch many weak, missing, self-referential, or suspicious sources before they enter your notes, report, classroom work, or decision-making process.
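The metadata question can be made mechanical once you have looked the source up yourself. The Python sketch below compares what the AI cited against the record you found. The `Reference` type, the field names, and the 0.9 title threshold are all assumptions made for this example, and the sample titles are invented placeholders. How you obtain the found record, for example from a library catalogue, is left out.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import List


@dataclass
class Reference:
    first_author: str  # family name of the first author
    title: str
    year: int


def _norm(s: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(s.lower().split())


def metadata_mismatches(
    claimed: Reference, found: Reference, title_threshold: float = 0.9
) -> List[str]:
    """List the fields where the AI's citation disagrees with the real record."""
    problems = []
    if _norm(claimed.first_author) != _norm(found.first_author):
        problems.append("first author")
    if claimed.year != found.year:
        problems.append("year")
    similarity = SequenceMatcher(
        None, _norm(claimed.title), _norm(found.title)
    ).ratio()
    if similarity < title_threshold:
        problems.append("title")
    return problems


# Placeholder values, invented for illustration.
claimed = Reference("Johnson", "Sleep and memory in adults", 2021)
found = Reference("Johnson", "Sleep and Memory in Adults", 2019)
print(metadata_mismatches(claimed, found))  # ['year']
```

An empty list means the metadata lines up. It does not answer the last question, whether the source supports the claim, which still requires reading.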

Where AI FactScan fits

AI FactScan is built for that first layer of caution. It does not replace reading the source. It does not decide truth for you. It helps surface source-quality signals early, so citation risks are easier to notice before you rely on them.

The bar is low: if a claim matters, its source has to be traceable. If a citation looks impressive, check whether it exists. If the source exists, check whether it actually supports the claim.

AI can move faster than you can verify. That gap is where mistakes get in.