Vol. 1 · Ed. 2026
CyberGlossary
Entry № 274

Data Anonymization

What is Data Anonymization?

Data Anonymization: Irreversibly transforming personal data so that no individual can be identified, directly or indirectly, even when combined with other available information.


Data anonymization removes or alters identifiers, quasi-identifiers, and sensitive attributes so that re-identification is no longer reasonably possible. Techniques include suppression, generalization, perturbation, aggregation, and randomization, often evaluated against privacy models such as k-anonymity, l-diversity, t-closeness, or differential privacy. Truly anonymized data falls outside the scope of GDPR (Recital 26), but the bar is high: regulators such as the EDPB and CNIL require formal re-identification risk assessments considering means "reasonably likely" to be used, including auxiliary datasets. Common pitfalls include relying on hashing alone, releasing high-dimensional micro-data, or treating pseudonymized data as anonymous.
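As an illustration of how generalization is evaluated against k-anonymity, here is a minimal sketch. The records, field names, and the `generalize` rule (10-year age bands, 3-digit ZIP prefixes) are hypothetical and chosen only for the example; a real release would also need a broader re-identification risk assessment, since passing a k-anonymity check alone does not establish that data is anonymous.

```python
from collections import Counter

def k_anonymity(records, quasi_identifiers):
    """Return the size of the smallest group of records that share
    the same combination of quasi-identifier values (0 if no records)."""
    classes = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(classes.values()) if classes else 0

def generalize(record):
    """Hypothetical generalization rule: coarsen age to a 10-year band
    and keep only the first three digits of the ZIP code."""
    low = record["age"] // 10 * 10
    return {**record, "age": f"{low}-{low + 9}", "zip": record["zip"][:3] + "**"}

# Made-up sample data for illustration only.
records = [
    {"age": 34, "zip": "75011", "diagnosis": "asthma"},
    {"age": 37, "zip": "75012", "diagnosis": "diabetes"},
    {"age": 52, "zip": "69003", "diagnosis": "asthma"},
    {"age": 58, "zip": "69007", "diagnosis": "flu"},
]
qi = ["age", "zip"]

k_raw = k_anonymity(records, qi)           # every record is unique on (age, zip)
generalized = [generalize(r) for r in records]
k_gen = k_anonymity(generalized, qi)       # records now fall into groups of two

print(f"k before generalization: {k_raw}")
print(f"k after generalization:  {k_gen}")
```

Generalization raises k here by trading precision for larger groups. It does not address attribute disclosure within a group, which is what l-diversity and t-closeness are meant to measure.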

Examples

  1. Publishing hospital readmission statistics aggregated by region and quarter, with cells containing counts below five suppressed.

  2. Releasing a public mobility dataset where trajectories are generalized to neighborhood-week granularity.
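The first example (aggregation with small-cell suppression) can be sketched as follows. The row layout, the `aggregate_with_suppression` helper, and the sample data are assumptions made for illustration; the threshold of five comes from the example itself.

```python
from collections import Counter

THRESHOLD = 5  # cells with fewer than this many records are suppressed

def aggregate_with_suppression(rows, threshold=THRESHOLD):
    """Count rows per (region, quarter) cell and replace any count
    below the threshold with None (i.e. suppressed)."""
    counts = Counter((r["region"], r["quarter"]) for r in rows)
    return {cell: (n if n >= threshold else None) for cell, n in counts.items()}

# Made-up readmission records: 12 in one cell, 3 in another.
rows = (
    [{"region": "North", "quarter": "2025-Q1"}] * 12
    + [{"region": "South", "quarter": "2025-Q1"}] * 3
)

table = aggregate_with_suppression(rows)
for (region, quarter), n in sorted(table.items()):
    print(region, quarter, "suppressed" if n is None else n)
```

This sketch performs primary suppression only. If row or column totals are published alongside the table, a suppressed cell may be recoverable by subtraction, so additional (complementary) suppression may be needed.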

Frequently asked questions

What is Data Anonymization?

Irreversibly transforming personal data so that no individual can be identified, directly or indirectly, even when combined with other available information. It belongs to the Privacy & Data Protection category of cybersecurity.

What are other names for Data Anonymization?

Common alternative names include: Anonymization, De-identification (strong sense).
