Adaptive Attack
What is an Adaptive Attack?
Adaptive Attack: An attack on a machine-learning system that is specifically designed to evade or break a known defence, instead of using a generic, defence-agnostic technique.
An adaptive attack is constructed with full knowledge of the target defence and its assumptions, and its loss function or constraints are tailored to bypass that defence. The term was popularised by Carlini and Wagner, whose evaluations repeatedly showed that defences claiming robustness against generic adversarial examples were broken once an attacker designed an objective specifically against them. Adaptive attacks now serve as a standard benchmark: any proposed defence for adversarial examples, watermarking, or detection should be evaluated against adversaries that are aware of the defence and can adapt their methodology. Skipping this step routinely leads to overstated robustness claims that fall to simple, principled attacks.
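The tailoring of the loss function described above can be written as a pair of objectives. This is only a sketch under assumed notation, none of which appears in the original entry: $x$ is a clean input, $\delta$ a perturbation bounded by $\epsilon$, $\ell_{\text{cls}}$ a loss that is low when the model is fooled, $\ell_{\text{def}}$ a loss that is low when the defence is bypassed, and $c > 0$ a trade-off weight chosen by the attacker.

```latex
\begin{align*}
\text{generic attack:}\quad  & \min_{\|\delta\|_\infty \le \epsilon} \; \ell_{\text{cls}}(x + \delta) \\
\text{adaptive attack:}\quad & \min_{\|\delta\|_\infty \le \epsilon} \; \ell_{\text{cls}}(x + \delta) + c\,\ell_{\text{def}}(x + \delta)
\end{align*}
```

The second term is what makes the attack adaptive: it encodes the specific defence under evaluation, so a different defence calls for a different $\ell_{\text{def}}$.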
● Examples
- 01
Carlini and Wagner break multiple adversarial-example detectors by retargeting their attack loss against each detector's specific decision rule.
- 02
An adaptive attack defeats a watermarking scheme for AI-generated images by optimising perturbations against the published detector.
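The idea behind the first example, retargeting the attack loss against a detector's decision rule, can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the two-dimensional linear classifier and detector, their weights, the weight `c`, and the helper names are hypothetical, and this is not Carlini and Wagner's actual attack or any real detector. The generic attack descends on the classifier score only, while the adaptive attack descends on the classifier score plus a weighted detector score.

```python
import numpy as np

# Hypothetical toy models (not from any real system): a linear classifier
# and a linear detector over 2-D inputs.
W_CLS, B_CLS = np.array([1.0, 0.0]), 0.0    # score > 0 means class 1
W_DET, B_DET = np.array([-1.0, 1.0]), 0.5   # score > 0 means "flagged"


def cls_score(x):
    return float(W_CLS @ x + B_CLS)


def det_score(x):
    return float(W_DET @ x + B_DET)


def generic_grad(x):
    # Gradient of the defence-agnostic loss: the classifier score alone.
    return W_CLS


def adaptive_grad(x, c=0.5):
    # Gradient of the defence-aware loss: classifier score + c * detector score.
    return W_CLS + c * W_DET


def pgd(x0, grad_fn, eps=1.5, step=0.1, n_steps=20):
    """Signed-gradient descent, projected onto the L-infinity ball around x0."""
    x = x0.copy()
    for _ in range(n_steps):
        x = x - step * np.sign(grad_fn(x))
        x = np.clip(x, x0 - eps, x0 + eps)
    return x


x0 = np.array([1.0, 0.0])  # clean input: class 1, not flagged
x_generic = pgd(x0, generic_grad)
x_adaptive = pgd(x0, adaptive_grad)

for name, x in [("clean", x0), ("generic", x_generic), ("adaptive", x_adaptive)]:
    print(f"{name:9s} class_1={cls_score(x) > 0}  flagged={det_score(x) > 0}")
```

In this toy setting both attacks flip the classifier's decision within the same perturbation budget, but only the generic one is flagged by the detector. Evaluating the detector against the generic attack alone would therefore overstate its robustness.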
● Frequently asked questions
What is an Adaptive Attack?
An attack on a machine-learning system that is specifically designed to evade or break a known defence, instead of using a generic, defence-agnostic technique. It belongs to the AI & ML Security category of cybersecurity.
How does an Adaptive Attack work?
An adaptive attack is constructed with full knowledge of the target defence and its assumptions, and its loss function or constraints are tailored to bypass that defence. The term was popularised by Carlini and Wagner, whose evaluations repeatedly showed that defences claiming robustness against generic adversarial examples were broken once an attacker designed an objective specifically against them. Adaptive attacks now serve as a standard benchmark: any proposed defence for adversarial examples, watermarking, or detection should be evaluated against adversaries that are aware of the defence and can adapt their methodology. Skipping this step routinely leads to overstated robustness claims that fall to simple, principled attacks.
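How do you defend against an Adaptive Attack?
By definition, an adaptive attack is designed around whichever defence is in place, so no single control can be assumed to stop it. The practical safeguard lies in evaluation: before claiming robustness, test any proposed defence against adversaries that know how it works and can adapt their methodology to it.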
What are other names for an Adaptive Attack?
Common alternative names include defence-aware attack and white-box adaptive evaluation.
● Related terms
- ai-security № 1168
Transferable Adversarial Attack
An attack in which adversarial examples crafted against one machine-learning model also fool other, unseen models, enabling black-box attacks without access to the target.
- ai-security № 018
Adversarial Example
An input deliberately perturbed — often imperceptibly to humans — so that a machine-learning model produces a wrong or attacker-chosen prediction.
- ai-security № 032
AI Red Team
A specialised team that simulates adversaries against AI systems to uncover safety, security, and misuse risks before real attackers do.
- ai-security № 026
AI Content Detection
Tools and techniques that estimate whether a piece of text, image, audio, or video was produced by an AI model rather than a human.