Evasion Attack (ML)
What is Evasion Attack (ML)?
Evasion Attack (ML): An inference-time attack in which an adversary crafts inputs that bypass a deployed machine-learning model's intended decision, such as evading a malware classifier or content filter.
Evasion attacks operate after a model is trained and deployed: the attacker does not touch the training pipeline but manipulates the inputs sent to the model so that they slip past detection. Many rely on adversarial examples, inputs perturbed (often imperceptibly) so that the model mispredicts, but the family also includes simpler bypass tactics such as polymorphic malware, character obfuscation against text moderation, voice cloning against speaker verification, and transformations against perceptual hashing. The NIST AI 100-2 report lists evasion as one of four main adversarial ML attack classes, alongside poisoning, privacy, and abuse attacks. Defences include adversarial training, robust feature engineering, ensemble or multi-modal detection, runtime input sanitization, telemetry on confidence drift, and tight access controls on model APIs to limit query-based reconnaissance.
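To make the adversarial-example route concrete, here is a minimal sketch of a gradient-sign step (in the spirit of the well-known FGSM technique) against a toy logistic "detector". The weights, bias, feature vector, and `epsilon` are all hypothetical values chosen for illustration, and the attacker is assumed to know the model's weights (a white-box setting); real evasion attacks often work with far less access.

```python
import math

# Hypothetical toy detector: logistic regression with hand-picked weights.
WEIGHTS = [2.0, -1.0, 1.5]
BIAS = -0.5

def score(x):
    """Return the detector's estimated probability that x is malicious."""
    z = sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS
    return 1.0 / (1.0 + math.exp(-z))

def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)

def evade(x, epsilon):
    """Shift every feature by at most epsilon in the direction that lowers the score.

    For a logistic model the gradient of the score with respect to x has the
    same sign as the weights, so stepping along -sign(w) reduces the score.
    """
    return [xi - epsilon * sign(w) for w, xi in zip(WEIGHTS, x)]

original = [1.0, 0.2, 0.5]                  # flagged as malicious (score > 0.5)
adversarial = evade(original, epsilon=0.6)  # same sample, bounded perturbation

print(f"original score:    {score(original):.3f}")
print(f"adversarial score: {score(adversarial):.3f}")
```

With these assumed numbers the original input scores above the 0.5 decision threshold and the perturbed one falls below it, even though no feature moved by more than `epsilon`. The model itself is untouched; only the query changed.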
● Examples
- Obfuscated malware that a static ML classifier rates as benign while still executing its payload.
- Homoglyph-laden text that bypasses a toxicity classifier but reads identically to a human.
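The homoglyph case can be sketched in a few lines. The example below uses a simple keyword filter as a stand-in for a toxicity classifier, which is an assumption made for brevity. The blocklist, the `CONFUSABLES` table, and the function names are hypothetical; a production system would need a much fuller look-alike table.

```python
import unicodedata

# Hypothetical blocklist and look-alike table, for illustration only.
BLOCKLIST = {"scam"}
CONFUSABLES = {
    "\u0430": "a",  # Cyrillic small a
    "\u0441": "c",  # Cyrillic small es
    "\u0435": "e",  # Cyrillic small ie
    "\u043e": "o",  # Cyrillic small o
}

def naive_filter(text):
    """Flag text if any blocklisted word appears verbatim."""
    lowered = text.lower()
    return any(word in lowered for word in BLOCKLIST)

def normalize(text):
    """Apply Unicode NFKC, then fold known look-alike characters to ASCII."""
    text = unicodedata.normalize("NFKC", text)
    return "".join(CONFUSABLES.get(ch, ch) for ch in text)

def hardened_filter(text):
    """Run the same filter, but on sanitized input."""
    return naive_filter(normalize(text))

clean = "this is a scam"
evasive = "this is a s\u0441\u0430m"  # 'c' and 'a' replaced by Cyrillic look-alikes

print(naive_filter(clean), naive_filter(evasive), hardened_filter(evasive))
```

The evasive string renders almost identically to the clean one, yet the unsanitized filter misses it. Normalizing the input before classification is one form of runtime input sanitization; it only helps for substitutions the table already knows about.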
● Frequently asked questions
What is Evasion Attack (ML)?
An inference-time attack in which an adversary crafts inputs that bypass a deployed machine-learning model's intended decision, such as evading a malware classifier or content filter. It belongs to the AI & ML Security category of cybersecurity.
How do you defend against Evasion Attack (ML)?
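Defences against evasion attacks typically combine technical controls (adversarial training, robust feature engineering, ensemble or multi-modal detection, and runtime input sanitization) with operational practices (telemetry on confidence drift and tight access controls on model APIs to limit query-based reconnaissance).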
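As one illustration of limiting query-based reconnaissance, the sketch below caps how many model queries each API key may issue within a sliding time window. The `QueryBudget` class, its parameters, and the key names are hypothetical, and timestamps are passed in explicitly so the behaviour is easy to follow; a real service would more likely enforce this at an API gateway.

```python
from collections import defaultdict, deque

class QueryBudget:
    """Sliding-window cap on model queries per API key (hypothetical helper)."""

    def __init__(self, max_queries, window_seconds):
        self.max_queries = max_queries
        self.window_seconds = window_seconds
        # Maps api_key -> timestamps of queries accepted inside the window.
        self._history = defaultdict(deque)

    def allow(self, api_key, now):
        """Return True if a query at time `now` (in seconds) is within budget."""
        history = self._history[api_key]
        # Drop timestamps that have aged out of the window.
        while history and now - history[0] >= self.window_seconds:
            history.popleft()
        if len(history) >= self.max_queries:
            return False
        history.append(now)
        return True

budget = QueryBudget(max_queries=3, window_seconds=60)
decisions = [budget.allow("key-1", t) for t in (0, 1, 2, 3, 61)]
print(decisions)
```

Here the fourth query from `key-1` is refused because three were already accepted in the last 60 seconds, and the query at second 61 is accepted again once older entries have expired. A cap like this slows an attacker who probes the model repeatedly to find inputs that evade it, but it does not stop evasion attempts that need only a few queries.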
What are other names for Evasion Attack (ML)?
Common alternative names include: Inference-time attack, Model evasion.
● Related terms
- Adversarial Example (ai-security № 018): An input deliberately perturbed, often imperceptibly to humans, so that a machine-learning model produces a wrong or attacker-chosen prediction.
- Backdoor Attack (ML) (ai-security № 081): A training-time attack that implants a hidden behaviour in a model so it acts normally on clean inputs but produces an attacker-chosen output whenever a secret trigger appears.
- AI Red Team (ai-security № 032): A specialised team that simulates adversaries against AI systems to uncover safety, security, and misuse risks before real attackers do.
- MLSecOps (ai-security № 691): The discipline of integrating security and risk controls across the entire machine-learning lifecycle, from data sourcing through training, deployment, monitoring, and retirement.
- Data Poisoning (ai-security № 281): An attack on a machine-learning system in which adversaries inject, alter, or relabel training data so the resulting model behaves incorrectly or contains hidden backdoors.
- OWASP LLM Top 10 (ai-security № 777): An OWASP-maintained list of the ten most critical security risks affecting applications that build on large language models.