theInspireSpy

What is anonymization?


What is Anonymization?

Anonymization is an operation that removes or replaces information (personal or other data) of various kinds and origins before it is transmitted to third parties (as with the Open Data movement, outsourcing, etc.).

Put simply, the information is replaced by values that no longer carry any meaning and from which the original information cannot be recovered.

Anonymization, the term most used in the EU, is sometimes referred to (depending on motivation or context) by:

But why should we anonymize?

Reason for anonymization

Various motivations lie behind an anonymization operation:

We will not expand on the applicable legal texts on the protection of privacy, which range from the Charter of Fundamental Rights of the European Union to other laws.

With personal data, two main families of data must be distinguished:

That said, there are two important questions:

Here are some answers.

What should be anonymized?

The answer to this first question is (apparently) simple: all information that makes it possible to identify a natural person, directly or indirectly. "Indirectly" includes identification by cross-checking or cross-referencing with other information from the same source or from other sources, even external ones.

Classic examples: first name, last name, street, town, age, date of birth, telephone numbers, email address, etc.

On a specific IT project, it is therefore necessary to carefully analyze and designate, precisely and unambiguously, all the information that will have to be anonymized.

This task, which is the responsibility of the project data manager, is crucial because the slightest oversight can have unfortunate consequences: once the data has been sent outside the organization, it is too late.
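To illustrate indirect identification by cross-referencing, here is a minimal sketch in Python. The records, the field names (`town`, `birth_year`, `gender`) and the helper `unique_combinations` are all hypothetical and not taken from any particular tool. The idea is simply to count how many records share each combination of seemingly harmless fields: a combination that occurs only once may be enough to single out one person.

```python
from collections import Counter

# Hypothetical records from which names and emails have already been removed.
records = [
    {"town": "Lyon", "birth_year": 1980, "gender": "F"},
    {"town": "Lyon", "birth_year": 1980, "gender": "F"},
    {"town": "Lyon", "birth_year": 1975, "gender": "M"},
    {"town": "Nice", "birth_year": 1980, "gender": "F"},
]


def unique_combinations(rows, fields):
    """Return the value combinations of `fields` that occur exactly once."""
    counts = Counter(tuple(row[f] for f in fields) for row in rows)
    return [combo for combo, n in counts.items() if n == 1]


# Combinations shared by a single record are candidates for re-identification.
risky = unique_combinations(records, ["town", "birth_year", "gender"])
print(risky)
```

A check of this kind does not prove that data is anonymous. It only helps the data manager spot fields that deserve a closer look during the analysis.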

How do we anonymize?

The answer to this second question may be somewhat disappointing: unfortunately, there is no miracle solution.

Anonymization is an IT project like any other. It can of course rely on software solutions, but these must be chosen with care, and their limits must be understood. The essential steps in an anonymization project are as follows:


  1. Start by identifying which data sources must be anonymized (database(s), incoming and/or outgoing flow(s), document model(s), document(s), etc.).
  2. Choose, for each source, whether the anonymization should be performed on the fly or in bulk. On-the-fly anonymization is complex; bulk anonymization is the most common case at the database level.
  3. Then identify and precisely locate (for each source) all the data to be anonymized.
  4. Define, for each type of data, the anonymization technique to be used (see below) and the business and technical constraints. There are many; only the most common and relevant are mentioned in the following sections.
  5. If the technical solution has not yet been selected, the specifications from the previous steps make it possible to choose one wisely, that is, one that can implement all the rules defined.
  6. Implement the anonymization process (by developing a specific tool, or by configuring and extending a software solution from a vendor).
  7. Carry out in-depth acceptance testing with data close to production data. This is not the usual acceptance testing of a classic IT project: the slightest omission (data left non-anonymized) or poor-quality anonymization (a reversible process, for example) cannot be undone once in production. Indeed, in production, the data can be consulted, copied, and used in an attempt at de-anonymization, even if the process is corrected afterward. Acceptance testing must therefore be carried out in a specific environment, independent of production.

This development process can obviously be done in an iterative and agile way.
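The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a reference implementation: the field names, the `RULES` table, the placeholder `SECRET_KEY` and the three rule functions are hypothetical. The sketch shows one rule per field (step 4), a bulk run over all rows (step 6), and a very basic check that no targeted field left the process unchanged (a small part of the testing in step 7).

```python
import hashlib
import hmac

# Hypothetical secret; in a real project it would be stored outside the code.
SECRET_KEY = b"replace-with-a-real-secret"


def suppress(value):
    """Replace the value with a constant that carries no information."""
    return "REDACTED"


def keyed_hash(value):
    """Replace the value with a keyed hash (HMAC-SHA256), truncated to 16 hex chars."""
    digest = hmac.new(SECRET_KEY, str(value).encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def keep_year(value):
    """Keep only the year of an ISO date string such as '1980-04-12'."""
    return value[:4]


# One rule per field to be anonymized (hypothetical field names).
RULES = {
    "last_name": suppress,
    "email": keyed_hash,
    "birth_date": keep_year,
}


def anonymize_bulk(rows, rules):
    """Bulk mode: return anonymized copies of all rows, leaving the input untouched."""
    return [
        {field: rules[field](value) if field in rules else value
         for field, value in row.items()}
        for row in rows
    ]


def find_leaks(original_rows, anonymized_rows, rules):
    """List (row index, field) pairs whose targeted value came out unchanged."""
    return [
        (i, field)
        for i, (before, after) in enumerate(zip(original_rows, anonymized_rows))
        for field in rules
        if field in before and before[field] == after[field]
    ]


rows = [
    {"id": 1, "last_name": "Martin", "email": "a.martin@example.com", "birth_date": "1980-04-12"},
    {"id": 2, "last_name": "Durand", "email": "c.durand@example.com", "birth_date": "1975-11-02"},
]
anonymized = anonymize_bulk(rows, RULES)
print(anonymized[0])
print(find_leaks(rows, anonymized, RULES))
```

Keeping the rules in a single table makes it easier to review which fields are covered and which are not. The leak check shown here only detects values that were not transformed at all; it says nothing about whether a transformation is reversible, which still has to be assessed technique by technique.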

Anonymization techniques

The basic anonymization techniques, which can be combined under certain conditions, must be well understood, along with the guarantees they do or do not provide:

 Techniques that are generally safe

 

It is understood that several elementary techniques can be combined to anonymize a type of data and that certain combinations are of no interest.

Example: Performing a hash followed by masking would be useless.
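One possible reading of this example can be sketched in Python. The functions `hash_value` and `mask`, and the sample email addresses, are hypothetical and chosen only for illustration. A hash already replaces the original value with an unreadable one, so masking the digest afterwards hides nothing further. In this sketch, the only visible effect of the extra masking step is that many different inputs end up with the same output.

```python
import hashlib


def hash_value(value):
    """Return the SHA-256 digest of the value as 64 hexadecimal characters."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def mask(value, visible=2):
    """Keep the first `visible` characters and replace the rest with '*'."""
    return value[:visible] + "*" * (len(value) - visible)


values = [f"user{i}@example.com" for i in range(1000)]

hashed = {hash_value(v) for v in values}
hashed_then_masked = {mask(hash_value(v)) for v in values}

# Hashing alone keeps the 1000 inputs distinguishable from one another.
print(len(hashed))
# Two visible hex characters allow at most 16 * 16 = 256 distinct outputs.
print(len(hashed_then_masked))
```

Whether hashing alone is sufficient for a given field is a separate question that this sketch does not address.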

 

 
