Data anonymization is important because it protects the privacy and confidentiality of individuals whose data is being used.

Data de-identification vs. anonymization
The degree of data anonymization matters because it determines which sharing regulations apply. There are at least two levels of data anonymization (White et al., 2022):
- Fully anonymized data: All personal identifiers are removed, and a separate identification code is assigned. Any link between the anonymized dataset and the original data is permanently deleted, so records cannot be traced back to individuals.
- De-identified data: Personal information is removed from the dataset, and each individual is assigned a unique identification number. However, a key is retained that allows the de-identified data to be linked back to the original personal data if needed.
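
The difference between the two levels can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the field names (`name`, `date_of_birth`, `participant_id`), the `pseudonymize` helper, and the made-up records are hypothetical, and the code only handles direct identifiers. It is not a complete anonymization procedure.

```python
import secrets

# Hypothetical field names treated as direct identifiers in this sketch.
DIRECT_IDENTIFIERS = ("name", "date_of_birth")


def pseudonymize(records, keep_key):
    """Strip direct identifiers and assign a random code to each record.

    Returns (cleaned_records, key). The key maps each code back to the
    removed identifiers; it is None when keep_key is False.
    """
    cleaned = []
    used_codes = set()
    key = {} if keep_key else None
    for record in records:
        # Draw a random code that is not derived from the data itself.
        code = "sub-" + secrets.token_hex(4)
        while code in used_codes:
            code = "sub-" + secrets.token_hex(4)
        used_codes.add(code)

        # Copy every field except the direct identifiers.
        stripped = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
        stripped["participant_id"] = code
        cleaned.append(stripped)

        if keep_key:
            key[code] = {k: record[k] for k in DIRECT_IDENTIFIERS if k in record}
    return cleaned, key


# Made-up example records, for illustration only.
records = [
    {"name": "Alice Example", "date_of_birth": "1990-01-01", "score": 12},
    {"name": "Bob Example", "date_of_birth": "1985-06-15", "score": 9},
]

# De-identified: a key exists and must be stored separately and securely.
deidentified, key = pseudonymize(records, keep_key=True)

# Fully anonymized: no key is ever built, so there is no link to delete.
anonymized, no_key = pseudonymize(records, keep_key=False)
```

In this sketch the only difference between the two outputs is whether `key` exists. With `keep_key=True` the dataset is de-identified and can be re-linked by whoever holds the key; with `keep_key=False` no mapping is retained.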
Recommended tools for data anonymization
(coming soon!)
- White, T., Blok, E., & Calhoun, V. D. (2022). Data sharing and privacy issues in neuroimaging research: Opportunities, obstacles, challenges, and monsters under the bed. Human Brain Mapping, 43(1), 278–291. https://doi.org/10.1002/hbm.25120