Vol. 1 · Ed. 2026
CyberGlossary
Entry № 576

k-Anonymity

What is k-Anonymity?

k-Anonymity: A privacy model proposed by Latanya Sweeney that requires every record in a dataset to be indistinguishable from at least k-1 others based on its quasi-identifiers.


k-Anonymity, formalized by Sweeney in 2002, protects against re-identification by ensuring that every combination of quasi-identifier values (such as age, ZIP code, and gender) appears in at least k records; records sharing a combination form an equivalence class. It is achieved via generalization (replacing exact values with ranges or broader categories) and suppression (removing values or records that remain too rare), often using algorithms such as Mondrian or Incognito. While k-anonymity reduces linkage attacks, it does not stop homogeneity attacks, where every record in an equivalence class shares the same sensitive value, or background-knowledge attacks, where an adversary uses outside information to rule out candidate values. These weaknesses motivated the l-diversity and t-closeness extensions. Practitioners pick k based on data utility, risk appetite, and regulatory expectations such as the identifiability standard described in GDPR Recital 26.
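As a rough illustration of the definition, the following Python sketch groups records into equivalence classes, checks whether each class has at least k members, and flags classes that are open to a homogeneity attack. The records, field names, and function names are hypothetical and chosen only for this example; it checks an already generalized table and is not an implementation of Mondrian or Incognito.

```python
from collections import defaultdict

# Hypothetical, already-generalized records; field names are assumptions.
RECORDS = [
    {"age": "30-39", "zip": "021**", "gender": "F", "diagnosis": "flu"},
    {"age": "30-39", "zip": "021**", "gender": "F", "diagnosis": "asthma"},
    {"age": "30-39", "zip": "021**", "gender": "F", "diagnosis": "flu"},
    {"age": "40-49", "zip": "100**", "gender": "M", "diagnosis": "diabetes"},
    {"age": "40-49", "zip": "100**", "gender": "M", "diagnosis": "diabetes"},
    {"age": "40-49", "zip": "100**", "gender": "M", "diagnosis": "diabetes"},
]
QUASI_IDENTIFIERS = ("age", "zip", "gender")


def equivalence_classes(records, quasi_identifiers):
    """Group records that share the same quasi-identifier values."""
    classes = defaultdict(list)
    for record in records:
        key = tuple(record[q] for q in quasi_identifiers)
        classes[key].append(record)
    return dict(classes)


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every equivalence class holds at least k records."""
    classes = equivalence_classes(records, quasi_identifiers)
    return all(len(group) >= k for group in classes.values())


def homogeneous_classes(records, quasi_identifiers, sensitive):
    """Return keys of classes whose records all share one sensitive value."""
    classes = equivalence_classes(records, quasi_identifiers)
    return [
        key
        for key, group in classes.items()
        if len({r[sensitive] for r in group}) == 1
    ]


if __name__ == "__main__":
    print(is_k_anonymous(RECORDS, QUASI_IDENTIFIERS, k=3))  # True
    print(is_k_anonymous(RECORDS, QUASI_IDENTIFIERS, k=4))  # False
    # The second class is 3-anonymous yet reveals "diabetes" for every member.
    print(homogeneous_classes(RECORDS, QUASI_IDENTIFIERS, "diagnosis"))
```

In this toy table both classes satisfy k=3, but the second class leaks its sensitive value, which is the gap that l-diversity is meant to close.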

Examples

  1. A medical dataset generalized so that every age/ZIP combination matches at least five patients (k=5).

  2. Generalizing date of birth to year-only to satisfy k-anonymity in a public research release.
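The generalization in the examples above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the records, the three-digit ZIP prefix, and the function names are invented for illustration, and the fixed generalization followed by suppression is a naive stand-in for what a real algorithm would choose adaptively.

```python
from collections import Counter
from datetime import date

# Hypothetical raw records; names and values are invented for illustration.
RAW = [
    {"dob": date(1984, 3, 14), "zip": "02139"},
    {"dob": date(1984, 11, 2), "zip": "02141"},
    {"dob": date(1984, 7, 30), "zip": "02138"},
    {"dob": date(1991, 1, 5), "zip": "02139"},
    {"dob": date(1991, 9, 21), "zip": "02142"},
    {"dob": date(1975, 6, 9), "zip": "10001"},
]


def generalize(record, zip_digits=3):
    """Coarsen date of birth to year-only and ZIP code to a masked prefix."""
    zip_code = record["zip"]
    return {
        "birth_year": record["dob"].year,
        "zip": zip_code[:zip_digits] + "*" * (len(zip_code) - zip_digits),
    }


def anonymize(records, k, zip_digits=3):
    """Generalize every record, then suppress classes smaller than k."""
    generalized = [generalize(r, zip_digits) for r in records]
    counts = Counter((g["birth_year"], g["zip"]) for g in generalized)
    return [g for g in generalized if counts[(g["birth_year"], g["zip"])] >= k]


if __name__ == "__main__":
    for row in anonymize(RAW, k=2):
        print(row)
```

With k=2 the single 1975 record is suppressed because no other record shares its generalized values; raising k removes more records, which shows the trade-off between privacy and data utility.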

Frequently asked questions

What is k-Anonymity?

A privacy model proposed by Latanya Sweeney that requires every record in a dataset to be indistinguishable from at least k-1 others based on its quasi-identifiers. It belongs to the Privacy & Data Protection category of cybersecurity.

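How do you address the weaknesses of k-Anonymity?

k-Anonymity is a defensive privacy model, not an attack. Its main weaknesses, homogeneity and background-knowledge attacks, are mitigated by extensions such as l-diversity and t-closeness, as detailed in the full definition above.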

What are other names for k-Anonymity?

Common alternative names include: k-Anonymization.

Related terms