Why LLMs Hallucinate (And What We Can Do About It)
LLMs confidently make things up — inventing case law, fake citations, and false facts. Learn why hallucinations happen and how RAG, CoT, and other techniques reduce them.
In 2023, a New York lawyer submitted a legal brief containing six case citations that didn't exist. His source? ChatGPT, which had invented the cases — complete with plausible docket numbers, judge names, and legal reasoning. The lawyer didn't verify them. The judge was not amused.
This is hallucination: when a language model generates text that is fluent, confident, and completely wrong. It's the single biggest barrier to trusting AI in high-stakes applications. Understanding why it happens is the first step toward dealing with it.
What counts as a hallucination?
Not all model errors are hallucinations. A rough taxonomy:
Factual hallucination: The model states something verifiably false as fact. "The Eiffel Tower was built in 1920" (it was 1889). The model presents the claim with the same confidence as a correct answer.
Fabrication: The model invents entities that don't exist. Fake research papers with fake authors. Non-existent legal cases. Made-up URLs that return 404.
Intrinsic hallucination: The model contradicts its own source material. When summarizing a document, it introduces details that aren't in the original.
Extrinsic hallucination: The model adds plausible but unverified information. When asked about a person, it fills in biographical details that sound reasonable but were never stated.
All of these share a common thread: the model generates text that sounds right but isn't.
Why it happens: Pattern matching, not knowledge retrieval
The root cause is architectural. An LLM doesn't have a database of facts that it looks up. It has learned statistical patterns from training data, and it generates text by predicting what tokens are most likely to follow the current sequence.
When you ask "Who invented the telephone?" the model doesn't retrieve a fact from a knowledge store. It generates tokens that are statistically likely to follow that question based on patterns seen in training. Usually, this produces "Alexander Graham Bell" — because that sequence followed similar questions many times in training data.
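This sampling view can be sketched in a few lines of Python. The question and its probability distribution here are invented for illustration; real models sample over tens of thousands of tokens, not whole answers:

```python
import random

# Hypothetical distribution a model might have learned for continuations of
# "Who invented the telephone?" -- the numbers are invented for illustration.
next_token_probs = {
    "Alexander Graham Bell": 0.92,
    "Antonio Meucci": 0.05,
    "Elisha Gray": 0.03,
}

def sample(probs):
    """Draw one continuation according to its learned probability."""
    r = random.random()
    cumulative = 0.0
    for token, p in probs.items():
        cumulative += p
        if r < cumulative:
            return token
    return token  # fallback for floating-point rounding

answer = sample(next_token_probs)  # usually "Bell", but occasionally not
```

Nothing in this loop checks whether the sampled answer is true; it only checks that the answer is likely. That is the whole problem in miniature.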
But this process can go wrong in several ways:
1. The training data is inconsistent
The internet contains contradictory information. If the model saw 80 pages saying X and 20 saying Y, it learned a probability distribution, not a verified fact. Sometimes the minority answer surfaces, especially at higher temperatures.
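The temperature effect can be seen with a standard softmax sketch. The two logits below are invented stand-ins for a majority answer and a minority answer learned from inconsistent data:

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities; higher temperature flattens the distribution."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Invented logits: index 0 is the majority answer, index 1 the minority answer.
logits = [4.0, 1.0]

for t in (0.5, 1.0, 2.0):
    p_minority = softmax_with_temperature(logits, t)[1]
    print(f"temperature={t}: P(minority answer) = {p_minority:.3f}")
```

At low temperature the minority answer is almost never sampled; at temperature 2.0 its probability rises to roughly one in five, so the contradiction in the training data resurfaces in the output.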
2. Pattern completion overrides factual accuracy
LLMs are optimized for fluent, plausible-sounding continuations. If the statistical pattern demands a name, date, or citation, the model will produce one — even if it has to make it up. Generating a plausible-looking citation (Author, Year, "Title in Proper Case," Journal Name, Vol. X, pp. Y-Z) is trivially easy for a model trained on academic text. Whether that specific citation exists is a completely different question that the model has no mechanism to verify.
3. The model can't distinguish "I know this" from "this sounds right"
Humans have metacognition — we can sense our own uncertainty. We know the difference between remembering a fact and guessing. LLMs don't have this. The generation process is the same whether the model is producing a well-established fact or an educated guess. There's no internal "confidence flag" — just probabilities over next tokens.
4. Compositionality creates novel falsehoods
The model might know that Fact A is true and Fact B is true, but when combining them in a new context, it can produce a conclusion that doesn't follow. Each step is locally plausible, but the chain produces a falsehood that never appeared in training data.
5. Long generations drift
In longer outputs, the model's context is increasingly dominated by its own generated text rather than the original prompt or training data. Small errors compound. A slightly wrong detail in paragraph 2 becomes the basis for further wrong details in paragraph 5. This is sometimes called "snowballing."
Famous hallucination examples
- The lawyer brief (2023): ChatGPT fabricated six legal cases cited in a federal court filing, complete with fake docket numbers and plausible reasoning.
- Google Bard's launch demo (2023): The first public demo of Bard claimed the James Webb Space Telescope took the very first pictures of a planet outside our solar system (it didn't). The error coincided with roughly $100 billion being wiped from Alphabet's market cap.
- Fake academic citations: Multiple studies have found that LLMs produce non-existent papers when asked for references, sometimes citing real authors on non-existent publications.
- Confident biography errors: Models frequently embellish biographies of real people — inventing awards, publications, or affiliations that sound plausible but never happened.
Mitigation strategies
No current technique eliminates hallucinations entirely, but several reduce their frequency significantly:
Retrieval-Augmented Generation (RAG)
Instead of relying on the model's parametric memory, retrieve relevant documents from a verified knowledge base and inject them into the prompt. The model generates its answer based on the retrieved text rather than its own training patterns.
RAG dramatically reduces factual hallucination because the model has the correct information right there in its context. But it doesn't eliminate the problem — the model can still misinterpret, misquote, or ignore the retrieved content.
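A minimal sketch of the RAG flow, with keyword overlap standing in for a real embedding-based retriever; the knowledge base and prompt wording are illustrative, not a fixed API:

```python
def retrieve(question, knowledge_base, top_k=3):
    """Return the top_k documents most relevant to the question.
    Real systems use embeddings; keyword overlap stands in here."""
    q_words = set(question.lower().split())
    def score(doc):
        return len(q_words & set(doc.lower().split()))
    return sorted(knowledge_base, key=score, reverse=True)[:top_k]

def build_rag_prompt(question, documents):
    """Inject retrieved text into the prompt so the answer is grounded in it."""
    context = "\n\n".join(f"[Document {i + 1}] {d}" for i, d in enumerate(documents))
    return (
        "Answer using ONLY the documents below. "
        "If they don't contain the answer, say you don't know.\n\n"
        f"{context}\n\nQuestion: {question}\nAnswer:"
    )

kb = [
    "The Eiffel Tower was completed in 1889 for the World's Fair.",
    "The Louvre is the world's most-visited museum.",
]
question = "When was the Eiffel Tower completed?"
prompt = build_rag_prompt(question, retrieve(question, kb))
```

The prompt now contains the correct date, so the model is completing over retrieved text rather than parametric memory. Whether it actually obeys the "ONLY the documents" instruction is the residual failure mode described above.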
Chain-of-thought prompting
Asking the model to "think step by step" before answering forces it to make its reasoning explicit. This makes errors more visible (to both humans and automated checkers) and often improves accuracy because intermediate reasoning steps constrain the final answer.
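In practice this is often just a prompt template. The wording below is one common pattern, not a fixed API:

```python
def cot_prompt(question):
    """Wrap a question with a chain-of-thought instruction so the model's
    intermediate reasoning is explicit and checkable."""
    return (
        f"Question: {question}\n"
        "Think step by step, numbering each step. "
        "Then give the final answer on a line starting with 'Answer:'."
    )

print(cot_prompt("A train leaves at 3pm and travels 2 hours. When does it arrive?"))
```

Forcing a numbered structure also makes automated checking easier: a verifier can parse the steps individually instead of judging one opaque final answer.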
Self-consistency and verification
Generate multiple answers to the same question and compare them. If the model gives different answers each time, it's uncertain. If all answers agree, confidence is higher (though not guaranteed).
Some systems use a second LLM call to fact-check the first output, or use the model to identify which of its own claims it's least confident about.
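The voting half of self-consistency can be sketched directly. Here `ask_model` is a stand-in for a sampled LLM call; the deterministic stub in the example always agrees with itself, while a real model would vary:

```python
from collections import Counter

def self_consistency(ask_model, question, n_samples=5):
    """Sample the model n_samples times and majority-vote.
    The vote share is a rough confidence signal, not a guarantee."""
    answers = [ask_model(question) for _ in range(n_samples)]
    best, count = Counter(answers).most_common(1)[0]
    return best, count / n_samples  # (winning answer, agreement fraction)

answer, agreement = self_consistency(
    lambda q: "1889",  # stub model; a real call would sample with temperature > 0
    "When was the Eiffel Tower completed?",
)
```

A low agreement fraction is a cheap trigger for escalation: route the question to retrieval, a human, or a refusal instead of returning the shaky majority answer.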
Grounded generation with citations
Train or prompt the model to cite its sources for every claim. When the model must point to a specific document or passage, fabrication becomes harder (though not impossible — models can cite documents that support claims the documents don't actually make).
Fine-tuning for calibration
RLHF and similar alignment techniques can train the model to say "I'm not sure" or "I don't have reliable information about that" instead of guessing. This reduces the frequency of hallucinations by teaching the model to hedge when appropriate.
Structured output constraints
For applications where hallucination is particularly dangerous, constrain the output format. Force the model to select from a predefined list of options rather than generating free text. Use tool calls that query verified databases rather than relying on parametric knowledge.
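A crude sketch of the select-from-a-list idea; the option set is hypothetical, and production systems would use constrained decoding or enum-typed tool schemas rather than post-hoc filtering:

```python
def constrained_answer(model_output, allowed_options, fallback="UNKNOWN"):
    """Reject any model output that isn't in the allowed set, so free-text
    hallucinations can't leak into downstream systems."""
    cleaned = model_output.strip()
    return cleaned if cleaned in allowed_options else fallback

options = {"APPROVED", "REJECTED", "NEEDS_REVIEW"}
print(constrained_answer("APPROVED", options))        # APPROVED
print(constrained_answer("Probably fine!", options))  # UNKNOWN (free text rejected)
```

The model can still pick the wrong option, but it can no longer invent an option, which is often the more dangerous failure.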
The fundamental tension
Hallucination isn't a bug that can be patched — it's a consequence of how these models work. A system that generates text by predicting likely continuations will sometimes generate plausible-sounding falsehoods, because plausibility and truth are different things.
The challenge for the field is building systems that maintain the fluency and flexibility of generative language models while adding reliable mechanisms for factual grounding. We're making progress — RAG, tool use, and better alignment all help — but the problem isn't solved.
For now, the practical advice is clear: use LLMs as powerful thinking partners, not as authoritative sources. Verify critical claims. Build systems that ground model output in verified data. And never submit an AI-generated legal brief without reading the cases yourself.
Explore LLM limitations interactively
See how hallucinations relate to the generation process in our interactive limitations section — explore the boundary between what models know, what they infer, and what they fabricate.