AI Hallucination
What is AI Hallucination?
AI Hallucination: A failure mode in which a generative AI system outputs content that is fluent and confident but factually wrong, fabricated, or unsupported by its sources.
Hallucinations arise from the statistical nature of generative models, which predict plausible continuations rather than verified facts. They include fabricated citations, invented API parameters, non-existent court cases (as in the 2023 Mata v. Avianca incident), hallucinated CVE numbers, and unsupported claims in RAG answers. Hallucinations become security issues when users act on the false output: installing an npm package whose name an LLM invented and an attacker later registered ("slopsquatting"), trusting fabricated legal guidance, or building exploit code on imagined behaviour. Mitigations include retrieval-augmented generation with citations, structured outputs, tool calling for factual lookups, evaluation suites (TruthfulQA, FActScore), calibrated abstention, and human review for high-stakes domains.
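Retrieval-augmented generation with citations only helps if the citations are actually checked. The following is a minimal Python sketch of such a check, assuming a hypothetical answer format in which every sentence carries bracketed numeric markers such as [1] before its closing punctuation, and that the application knows which source ids it retrieved. The names `check_citations` and `CitationReport` are illustrative and not part of any library. The check confirms only that citations are present and point at retrieved sources; it does not confirm that a source supports the claim.

```python
from __future__ import annotations

import re
from dataclasses import dataclass

# Assumed (hypothetical) citation format: "[1]", "[12]", ...
CITATION = re.compile(r"\[(\d+)\]")


@dataclass
class CitationReport:
    unknown_ids: list[int]        # cited ids that were never retrieved
    uncited_sentences: list[str]  # sentences with no citation marker

    @property
    def ok(self) -> bool:
        return not self.unknown_ids and not self.uncited_sentences


def check_citations(answer: str, retrieved_ids: set[int]) -> CitationReport:
    """Flag sentences without citations and citations to unknown sources."""
    # Naive sentence split on whitespace that follows ., ! or ?
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", answer) if s.strip()]
    unknown: list[int] = []
    uncited: list[str] = []
    for sentence in sentences:
        ids = [int(m) for m in CITATION.findall(sentence)]
        if not ids:
            uncited.append(sentence)
        for cited in ids:
            if cited not in retrieved_ids and cited not in unknown:
                unknown.append(cited)
    return CitationReport(unknown, uncited)


report = check_citations(
    "The patch landed in 2.4 [1]. It fixes the parser [3]. Upgrade now.",
    retrieved_ids={1, 2},
)
print(report.ok, report.unknown_ids, report.uncited_sentences)
```

An application could route any answer whose report is not `ok` to abstention or human review rather than showing it to the user.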
● Examples
- 01
An LLM citing a 2023 court case that does not exist, complete with fabricated docket numbers.
- 02
A coding assistant recommending an npm package name that has never been published, which lets an attacker register that name with malicious code (slopsquatting).
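The package-name example suggests a simple gate before anything an assistant recommends is installed. The sketch below is one possible approach in Python, not an established tool: it assumes a team-maintained allowlist of reviewed packages, and it assumes the public npm registry at `https://registry.npmjs.org/` answers 404 for names that have not been published. The function names `exists_on_npm` and `vet_packages` and the three status labels are hypothetical. Note that a name existing on the registry is not proof of safety, since an attacker may already have registered a hallucinated name, so anything outside the allowlist is sent to review.

```python
from __future__ import annotations

import urllib.error
import urllib.parse
import urllib.request
from typing import Callable, Iterable

REGISTRY_URL = "https://registry.npmjs.org/"  # assumed public registry endpoint


def exists_on_npm(name: str, timeout: float = 5.0) -> bool:
    """Return True if the registry answers 200 for this package name."""
    # Scoped names such as "@scope/pkg" need the slash percent-encoded.
    url = REGISTRY_URL + urllib.parse.quote(name, safe="@")
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise


def vet_packages(
    suggested: Iterable[str],
    allowlist: set[str],
    exists: Callable[[str], bool] = exists_on_npm,
) -> dict[str, str]:
    """Classify each suggested package before anyone runs `npm install`."""
    report: dict[str, str] = {}
    for name in suggested:
        if name in allowlist:
            report[name] = "approved"      # already reviewed by the team
        elif exists(name):
            report[name] = "needs-review"  # published, but never vetted
        else:
            report[name] = "not-found"     # likely hallucinated
    return report


# Offline usage with a stand-in for the registry lookup.
published = {"express", "left-pad"}
print(vet_packages(
    ["express", "left-pad", "fast-json-magic"],
    allowlist={"express"},
    exists=lambda name: name in published,
))
```

Passing the lookup as a parameter keeps the classification logic testable without network access; only "approved" names would proceed to installation automatically.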
● Frequently asked questions
What is AI Hallucination?
A failure mode in which a generative AI system outputs content that is fluent and confident but factually wrong, fabricated, or unsupported by its sources. It belongs to the AI & ML Security category of cybersecurity.
How does AI Hallucination work?
Hallucinations arise from the statistical nature of generative models, which predict plausible continuations rather than verified facts. Typical outputs include fabricated citations, invented API parameters, non-existent court cases (as in the 2023 Mata v. Avianca incident), hallucinated CVE numbers, and unsupported claims in RAG answers. They become security issues when users act on the false output: installing an npm package whose name an LLM invented and an attacker later registered ("slopsquatting"), trusting fabricated legal guidance, or building exploit code on imagined behaviour.
How do you defend against AI Hallucination?
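Defences for AI Hallucination combine technical controls and operational practices: retrieval-augmented generation with citations, structured outputs, tool calling for factual lookups, evaluation suites (TruthfulQA, FActScore), calibrated abstention, and human review for high-stakes domains.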
What are other names for AI Hallucination?
Common alternative names include LLM hallucination and confabulation.
● Related terms
- ai-security № 898
RAG Security
The discipline of securing retrieval-augmented generation pipelines so that the documents, vector stores, and retrieval steps that feed an LLM cannot be poisoned, abused, or used to exfiltrate data.
- ai-security № 033
AI Safety
The discipline that aims to prevent AI systems from causing unintended harm to users, operators, and society — covering technical, operational, and societal dimensions.
- ai-security № 026
AI Content Detection
Tools and techniques that estimate whether a piece of text, image, audio, or video was produced by an AI model rather than a human.
- ai-security № 034
AI Supply Chain Risk
The set of threats arising from the third-party datasets, base models, libraries, plug-ins, and infrastructure that organisations combine to build and deploy AI systems.
- ai-security № 777
OWASP LLM Top 10
An OWASP-maintained list of the ten most critical security risks affecting applications that build on large language models.
- ai-security № 618
LLM Guardrails
Mechanisms that constrain what an LLM-based application can input or output, enforcing safety, security, and business rules around the underlying model.
● See also
- № 024 AI Alignment