l-Diversity
What is l-Diversity?
l-Diversity: An extension of k-anonymity, introduced by Machanavajjhala et al., that requires each equivalence class to contain at least l well-represented values for every sensitive attribute.
l-Diversity, first presented in 2006 and published in journal form in 2007, addresses two weaknesses of k-anonymity: homogeneity attacks (where every record in an equivalence class shares the same sensitive value) and background-knowledge attacks (where an attacker uses outside information to rule out candidate values). By requiring at least l well-represented values of each sensitive attribute within every class, it prevents an attacker from pinning down the sensitive value with certainty, even after narrowing a target to a single class. Variants include distinct l-diversity, entropy l-diversity, and recursive (c, l)-diversity, each striking a different balance between privacy strength and information loss. l-Diversity is often combined with k-anonymity and t-closeness in privacy-preserving releases and is supported by tools such as ARX, sdcMicro, and Amnesia for health, census, and survey datasets.
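The three variants named above can be stated as conditions on a single equivalence class. The notation below is an assumption made for illustration, not taken from this entry: E is an equivalence class, S the set of sensitive values, n(E, s) the number of records in E with sensitive value s, p(E, s) the corresponding fraction, and r_1 >= r_2 >= ... >= r_m the counts of the sensitive values in E sorted from most to least frequent.

```latex
% Distinct l-diversity: at least l different sensitive values occur in E
\left|\{\, s \in S : n(E, s) > 0 \,\}\right| \ge \ell

% Entropy l-diversity: the entropy of the sensitive values in E is at least log(l)
% (terms with p(E, s) = 0 contribute 0 to the sum)
-\sum_{s \in S} p(E, s)\,\log p(E, s) \ge \log \ell,
\qquad p(E, s) = \frac{n(E, s)}{|E|}

% Recursive (c, l)-diversity: the most frequent value is not too dominant
r_1 < c\,\bigl(r_\ell + r_{\ell+1} + \dots + r_m\bigr)
```

A release satisfies a given variant when the condition holds for every equivalence class. Distinct l-diversity is the weakest of the three, since a class can contain l different values and still be dominated by one of them.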
● Examples
- 01
Ensuring each age/ZIP group of patients includes at least three distinct diagnoses before publication.
- 02
Applying entropy l-diversity to a salary dataset so that no equivalence class is dominated by one income band.
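Both examples can be checked mechanically. The following is a minimal sketch in plain Python, not the API of ARX, sdcMicro, or Amnesia. The function names, the column names (age, zip, diagnosis), and the toy records are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict


def equivalence_classes(records, quasi_identifiers, sensitive):
    """Group sensitive values by the tuple of quasi-identifier values."""
    classes = defaultdict(list)
    for row in records:
        key = tuple(row[q] for q in quasi_identifiers)
        classes[key].append(row[sensitive])
    return classes


def is_distinct_l_diverse(values, l):
    """True if the class holds at least l different sensitive values."""
    return len(set(values)) >= l


def is_entropy_l_diverse(values, l):
    """True if the entropy of the sensitive values is at least log(l)."""
    n = len(values)
    entropy = -sum((c / n) * math.log(c / n) for c in Counter(values).values())
    # Small tolerance so a perfectly uniform class is not rejected by rounding.
    return entropy >= math.log(l) - 1e-12


def is_recursive_cl_diverse(values, c, l):
    """True if r_1 < c * (r_l + ... + r_m) for the sorted value counts."""
    counts = sorted(Counter(values).values(), reverse=True)
    if len(counts) < l:
        return False
    return counts[0] < c * sum(counts[l - 1:])


def violating_classes(records, quasi_identifiers, sensitive, check):
    """Return the quasi-identifier keys of classes that fail the given check."""
    classes = equivalence_classes(records, quasi_identifiers, sensitive)
    return [key for key, values in classes.items() if not check(values)]


# Hypothetical, already-generalized patient records (toy data).
records = [
    {"age": "30-39", "zip": "130**", "diagnosis": "flu"},
    {"age": "30-39", "zip": "130**", "diagnosis": "asthma"},
    {"age": "30-39", "zip": "130**", "diagnosis": "diabetes"},
    {"age": "40-49", "zip": "148**", "diagnosis": "cancer"},
    {"age": "40-49", "zip": "148**", "diagnosis": "cancer"},
    {"age": "40-49", "zip": "148**", "diagnosis": "flu"},
]

# Example 01: every age/ZIP group must contain at least three distinct diagnoses.
failing_distinct = violating_classes(
    records, ["age", "zip"], "diagnosis",
    lambda values: is_distinct_l_diverse(values, 3),
)

# Example 02 (same idea on diagnoses): no class may be dominated by one value.
failing_entropy = violating_classes(
    records, ["age", "zip"], "diagnosis",
    lambda values: is_entropy_l_diverse(values, 2),
)
```

In this toy data the 40-49 / 148** group has two distinct diagnoses, so it passes distinct 2-diversity, yet it fails entropy 2-diversity because "cancer" accounts for two of its three records. A class flagged by such a check would typically be generalized further or suppressed before release.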
● Frequently asked questions
What is l-Diversity?
An extension of k-anonymity introduced by Machanavajjhala et al. that requires each equivalence class to contain at least l well-represented values for every sensitive attribute. It belongs to the Privacy & Data Protection category of cybersecurity.
What does l-Diversity defend against?
l-Diversity is a defence rather than an attack. It protects k-anonymized data against homogeneity and background-knowledge attacks by requiring at least l well-represented sensitive values in every equivalence class.
What are other names for l-Diversity?
Common alternative names include: l-Diverse k-Anonymity.
● Related terms
- privacy № 576
k-Anonymity
A privacy model proposed by Latanya Sweeney that requires every record in a dataset to be indistinguishable from at least k-1 others based on its quasi-identifiers.
- privacy № 1126
t-Closeness
A privacy model by Li, Li, and Venkatasubramanian that strengthens l-diversity by limiting how far the distribution of a sensitive attribute in any class differs from its global distribution.
- privacy № 274
Data Anonymization
Irreversibly transforming personal data so that no individual can be identified, directly or indirectly, even when combined with other available information.
- privacy № 317
Differential Privacy
A mathematical framework that quantifies privacy loss when releasing statistics or training models, by adding calibrated noise so any single individual's contribution is provably bounded.
- privacy № 875
Pseudonymization
A technique that replaces direct identifiers in personal data with reversible aliases, so that the data can no longer be attributed to an individual without additional, separately kept information.
- privacy № 280
Data Minimization
A privacy principle requiring organizations to collect, process, and retain only the personal data that is strictly necessary for a defined, lawful purpose.