Back in May, the Information Commissioner's Office (ICO) released for public consultation the first chapter of its draft guidance on anonymisation, pseudonymisation and privacy enhancing technologies, with the aim of helping organisations consider the key issues in anonymising personal data. Chapter 1 introduced key concepts in the context of UK data protection law, and the hot-off-the-press Chapter 2 delves into the question of "how do we ensure anonymisation is effective?".
The ICO is seeking feedback on each chapter in turn (eight chapters in total) in order to maximise the amount and detail of feedback it receives ahead of publishing its final complete guidance, which will be subject to formal consultation.
A pragmatic stance is taken throughout this chapter of the draft guidance, including a clarification that anonymisation is a process that seeks to reduce the risk of an individual being identified to a sufficiently remote level. The ICO recognises that it is not always possible to reduce the risk of identifiability to zero, and therefore effective anonymisation is a balancing act between managing this risk and maintaining the usefulness of the data.
Note that this "sufficiently remote" level depends on a number of context-specific factors detailed throughout the draft guidance. "Identifiability" is at its core about distinguishing one person from others - the simple removal of direct identifiers such as someone's name is not sufficient for these purposes (but is recommended to help you keep on the right side of the data minimisation principle!).
The guidance sets out three key indicators of whether or not information is personal data:
1. Singling out - can you tell one individual apart from others within a dataset? Consider the richness of the data, how potentially identifying the categories of data are, and whether you have sufficient safeguards in place to reduce this risk.
2. Linkability - can separate data sources be combined to make someone identifiable? Techniques such as masking and tokenisation of identifying variables (e.g. sex and age) are useful here. Note that if there are other records which could be linked to enable identifiability, this means that the data is pseudonymised rather than anonymised.
3. Inferences - is there potential for someone to infer, guess or predict details about someone, e.g. using information from various sources to deduce something based on the qualities of others who appear similar? To determine the likelihood of identifiability through inference, look at whether someone could be identified using incomplete datasets, different pieces of information from the same dataset, or other information you have or may reasonably be expected to obtain.
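To make the first two indicators more concrete, here is a minimal Python sketch of how an organisation might screen a dataset for singling out and linkability. It is an illustration only and is not taken from the ICO's draft guidance: the toy records, the column names (age_band, sex, postcode_area), the second "public register" dataset and the helper functions are all hypothetical, and counting how many records share a combination of values is just one common way of approximating these risks.

```python
from collections import Counter

# Hypothetical dataset with direct identifiers (e.g. names) already removed.
records = [
    {"age_band": "30-39", "sex": "F", "postcode_area": "EH1"},
    {"age_band": "30-39", "sex": "F", "postcode_area": "EH1"},
    {"age_band": "40-49", "sex": "M", "postcode_area": "EH1"},
    {"age_band": "30-39", "sex": "M", "postcode_area": "G12"},
    {"age_band": "30-39", "sex": "M", "postcode_area": "G12"},
]

# Hypothetical second source that an intruder might hold, including names.
public_register = [
    {"name": "A. Example", "age_band": "40-49", "sex": "M", "postcode_area": "EH1"},
    {"name": "B. Example", "age_band": "30-39", "sex": "F", "postcode_area": "EH1"},
    {"name": "C. Example", "age_band": "30-39", "sex": "F", "postcode_area": "EH1"},
]


def group_sizes(rows, columns):
    """Count how many rows share each combination of values in `columns`."""
    return Counter(tuple(row[c] for c in columns) for row in rows)


def singled_out(rows, columns):
    """Return the rows whose combination of values in `columns` is unique."""
    sizes = group_sizes(rows, columns)
    return [row for row in rows if sizes[tuple(row[c] for c in columns)] == 1]


def linkable(rows, other_rows, columns):
    """Return the rows that match exactly one row of `other_rows` on `columns`."""
    other_sizes = group_sizes(other_rows, columns)
    return [row for row in rows if other_sizes[tuple(row[c] for c in columns)] == 1]


# Assumption: these three columns are the indirectly identifying variables.
quasi_identifiers = ["age_band", "sex", "postcode_area"]

unique_rows = singled_out(records, quasi_identifiers)
linked_rows = linkable(records, public_register, quasi_identifiers)

print("Records that can be singled out:", unique_rows)
print("Records linkable to one named person:", linked_rows)
```

In this toy data, one record stands alone on the chosen columns and also matches a single named entry in the second source, so removing names did not make it anonymous. A check like this covers only exact matches on the columns you choose; it says nothing about inference, nor about the context-specific factors the draft guidance asks organisations to weigh.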
"Identifiability" is regarded as a spectrum ranging from something that relates to directly identified/identifiable individuals on one end, to information that is impossible to link to an individual on the other. Where data falls within this spectrum may change over time, depending on the means reasonably likely to be used to identify someone, such as available technologies, as well as factors such as the time and monetary cost that would be required to achieve identification. Context is important here also as organisations have to consider the likelihood of someone trying to identify an individual in a dataset - the higher the likelihood, the more care needed to ensure effective anonymisation.
Both the information and the environment in which it is processed are important factors to consider. Schrödinger would presumably be very interested in personal data, as the same piece of data may or may not be personal data, although this is more down to whose hands it is in, and whether they have access to additional information, than to a radioactive substance and an unfortunate cat.
Where information is going to be released, organisations must keep identifiability in mind when considering how and where it will be shared - will it be released to the public at large, or to a defined group? In short, the wider the release, the more care should be taken to ensure anonymisation is effective. Something the ICO puts forward that may help organisations in this exercise is the motivated intruder test - asking whether a reasonably competent intruder, with no prior knowledge but with access to appropriate resources and the wish to identify an individual, could obtain further information that could be used for re-identification.
The call for views on the draft of Chapter 2 ends on 28 November.