Anonymization of directory-structured sensitive data
2019 (English)
Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
Student thesis
Alternative title
Anonymisering av katalogstrukturerad känslig data (Swedish)
Abstract [en]
Data anonymization is an important field within data privacy that seeks a good balance between utility and privacy in data. The field has become especially relevant since the GDPR came into force, because the GDPR does not regulate anonymous data. This thesis focuses on anonymization of directory-structured data, that is, data organized into a tree of directories. Four of the most common models for anonymizing tabular data (k-anonymity, ℓ-diversity, t-closeness and differential privacy) are adapted to directory-structured data. The adaptation is done through three approaches: SingleTable, DirectoryWise and RecursiveDirectoryWise. The models and approaches are compared and evaluated using five metrics and three attack scenarios. The results show that there is always a trade-off between utility and privacy when anonymizing data. In particular, the differential privacy model combined with the RecursiveDirectoryWise approach gives the highest privacy, but also the highest information loss. Conversely, the k-anonymity model with the SingleTable approach and the t-closeness model with the DirectoryWise approach give the lowest information loss, but also the lowest privacy. The differential privacy model and the RecursiveDirectoryWise approach were also shown to give the best protection against the chosen attacks. Finally, it was concluded that the differential privacy model combined with the RecursiveDirectoryWise approach was the most suitable combination when aiming to comply with the GDPR while anonymizing directory-structured data.
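As a reminder of what the first of these models requires, the following is a minimal sketch of a k-anonymity check on a single table, using the standard textbook definition. It is not taken from the thesis: the function name `k_anonymity_level`, the column names and the sample records are all hypothetical, and the sketch does not attempt to reproduce the SingleTable, DirectoryWise or RecursiveDirectoryWise approaches, whose details are not given in this record.

```python
from collections import Counter


def k_anonymity_level(rows, quasi_identifiers):
    """Return the largest k for which `rows` is k-anonymous.

    A table is k-anonymous when every combination of quasi-identifier
    values is shared by at least k rows, so the level equals the size of
    the smallest such group. An empty table yields 0.
    """
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(groups.values(), default=0)


# Hypothetical records (assumption): age and ZIP code are already generalized.
records = [
    {"age": "20-29", "zip": "581**", "diagnosis": "flu"},
    {"age": "20-29", "zip": "581**", "diagnosis": "asthma"},
    {"age": "30-39", "zip": "582**", "diagnosis": "flu"},
    {"age": "30-39", "zip": "582**", "diagnosis": "diabetes"},
    {"age": "30-39", "zip": "582**", "diagnosis": "flu"},
]

# The smallest group over (age, zip) has two rows, so the table is 2-anonymous.
print(k_anonymity_level(records, ["age", "zip"]))
```

Generalizing values further (for example, wider age ranges) tends to raise k at the cost of detail, which is one concrete form of the utility and privacy trade-off the abstract describes.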
Place, publisher, year, edition, pages
2019. p. 70
Keywords [en]
data anonymization, data privacy, directory-structured data, k-anonymity, l-diversity, t-closeness, differential privacy, GDPR
National Category
Computer Engineering
Identifiers
URN: urn:nbn:se:liu:diva-160952
ISRN: LIU-IDA/LITH-EX-A--19/078--SE
OAI: oai:DiVA.org:liu-160952
DiVA, id: diva2:1361572
External cooperation
Cybercom Sweden
Subject / course
Computer Engineering
Supervisors
Examiners
2019-10-25, 2019-10-16, 2019-10-25
Bibliographically approved