
Pseudonymization vs. anonymization: the GDPR difference.

The two are constantly confused, and the confusion is expensive: anonymized data falls outside GDPR entirely, while pseudonymized data is still personal data with all the obligations attached. Here is the line between them — and why encryption sits firmly on the pseudonymization side.


The definitions

Same goal, very different legal status.

Both techniques reduce the visibility of identities in a dataset. The difference is whether re-identification remains possible.

Pseudonymization

Replace identifying fields with a token, hash, or ciphertext, while keeping a separate mapping (a key or lookup) that can reverse the process. The data can still be linked back to a person — so under GDPR it is still personal data.
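To make the role of the mapping concrete, here is a minimal tokenization sketch. The `TokenVault` class, the `tok_` prefix, and the sample record are hypothetical names chosen for illustration, and the in-memory dictionaries stand in for what would normally be a separately secured store.

```python
import secrets


class TokenVault:
    """Hypothetical in-memory lookup. The stored mapping is what keeps the data reversible."""

    def __init__(self):
        self._token_by_value = {}
        self._value_by_token = {}

    def tokenize(self, value: str) -> str:
        # Reuse the existing token so the same person maps to the same pseudonym.
        if value not in self._token_by_value:
            token = "tok_" + secrets.token_hex(8)
            self._token_by_value[value] = token
            self._value_by_token[token] = value
        return self._token_by_value[value]

    def detokenize(self, token: str) -> str:
        # Anyone with access to the vault can reverse the pseudonym.
        return self._value_by_token[token]


vault = TokenVault()
record = {"email": "ada@example.com", "plan": "pro"}
pseudonymized = {**record, "email": vault.tokenize(record["email"])}
```

As long as `detokenize` can succeed, the tokenized record can be linked back to a person, which is why it stays personal data.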

Anonymization

Transform the data so that no one — including you — can reasonably re-identify an individual, and there is no key to reverse it. True anonymization takes the data out of GDPR scope entirely.

The deciding test

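Ask: could this data be linked back to a person using means reasonably likely to be used, by you or by anyone else? Recital 26 of the GDPR frames the test this way and points to objective factors such as the cost and time re-identification would take and the technology available. If yes, it is personal data, pseudonymized at best (still regulated). If genuinely no, it is anonymized (out of scope).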


The common mistake

Encryption is pseudonymization, not anonymization.

Encrypting a person's email with a key you still hold does not anonymize it: you can decrypt it at any time, so it can be linked back to them. The Article 29 Working Party's Opinion 05/2014 on Anonymisation Techniques lists encryption with a secret key and tokenization among pseudonymization techniques, and GDPR Article 32 names pseudonymization and encryption as security measures to consider. They are excellent safeguards, but they do not remove the data from scope.

This matters because teams sometimes assume that because a warehouse column is encrypted, it is 'not really PII' and can be excluded from deletion workflows. That assumption fails an audit. Encrypted personal data is personal data, and it must be included in access and erasure requests.
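One way to keep that assumption out of a deletion workflow is to scope erasure by whether a column holds personal data and never by whether it is encrypted. The sketch below assumes a hypothetical column catalog; the `Column` fields and the table names are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    table: str
    name: str
    personal_data: bool
    encrypted: bool


def erasure_scope(catalog):
    """Columns a deletion request must cover. Encryption status is deliberately ignored."""
    return [c for c in catalog if c.personal_data]


catalog = [
    Column("users", "email", personal_data=True, encrypted=True),
    Column("users", "display_name", personal_data=True, encrypted=False),
    Column("events", "page_path", personal_data=False, encrypted=False),
]

in_scope = erasure_scope(catalog)
```

The encrypted `users.email` column lands in scope alongside the plaintext one; the `encrypted` flag is kept only as metadata.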


Where crypto-shredding fits

Turning pseudonymized data into erased data.

Here is the elegant part. Encrypted data counts as pseudonymized precisely because the key still exists, so destroying the key changes its status. Once the per-user key is gone and genuinely unrecoverable, the ciphertext can no longer be linked to the person by any reasonable means. That is the same identifiability test that takes data out of scope, and it is the basis for treating the data as erased.

So crypto-shredding is the bridge: encrypt on write (strong pseudonymization that protects data at rest and in backups), then destroy the key on a deletion request (which renders every copy permanently inaccessible). European Data Protection Board guidance accepts key destruction as satisfying the erasure obligation when the encryption is strong and the key is truly unrecoverable.
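The encrypt-on-write, destroy-on-delete flow can be sketched as follows. Everything here is an assumption made for illustration: `CryptoShredder` is a hypothetical in-memory key store, and the cipher is a toy stream cipher built from HMAC-SHA256 so the example runs on the Python standard library alone. A real system would use a vetted authenticated cipher such as AES-GCM from an audited library and keep keys in a KMS or HSM.

```python
import hashlib
import hmac
import secrets


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: HMAC-SHA256 in counter mode. Illustration only, not for production."""
    blocks = []
    for counter in range((length + 31) // 32):
        msg = nonce + counter.to_bytes(8, "big")
        blocks.append(hmac.new(key, msg, hashlib.sha256).digest())
    return b"".join(blocks)[:length]


class CryptoShredder:
    """Hypothetical per-user key store. One key per user, destroyed on deletion."""

    def __init__(self):
        self._keys = {}  # user_id -> 32-byte key

    def encrypt(self, user_id: str, plaintext: str) -> bytes:
        # Encrypt on write: create the user's key on first use.
        key = self._keys.setdefault(user_id, secrets.token_bytes(32))
        nonce = secrets.token_bytes(16)  # fresh nonce per record
        data = plaintext.encode("utf-8")
        stream = _keystream(key, nonce, len(data))
        return nonce + bytes(a ^ b for a, b in zip(data, stream))

    def decrypt(self, user_id: str, blob: bytes) -> str:
        key = self._keys[user_id]  # raises KeyError once the key is shredded
        nonce, body = blob[:16], blob[16:]
        stream = _keystream(key, nonce, len(body))
        return bytes(a ^ b for a, b in zip(body, stream)).decode("utf-8")

    def shred(self, user_id: str) -> None:
        # Deletion request: drop the key. Ciphertext copies elsewhere become unreadable.
        self._keys.pop(user_id, None)


store = CryptoShredder()
blob = store.encrypt("user-42", "ada@example.com")
store.decrypt("user-42", blob)  # readable while the key exists
store.shred("user-42")          # after this, decrypt raises KeyError
```

The ciphertext in `blob` can sit in warehouses and backups untouched. What has to be destroyed everywhere is the key, including any backups of the key store itself, or the data is not truly unrecoverable.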


A caution on anonymization

Real anonymization is harder than it looks.

Stripping names is not anonymization. Latanya Sweeney's widely cited study estimated that 87% of the US population could be uniquely identified from just three indirect identifiers: 5-digit ZIP code, birth date, and gender. Aggregation, generalization, and techniques like k-anonymity or differential privacy exist because naive anonymization routinely fails.
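A quick way to see the problem is to measure k-anonymity: the size of the smallest group of rows that share the same indirect identifiers. The sketch below uses invented sample rows and a hypothetical `generalize` helper; a k of 1 means at least one person is unique on those fields alone.

```python
from collections import Counter

QUASI_IDENTIFIERS = ("zip", "birth_year", "gender")


def k_anonymity(rows, quasi_identifiers=QUASI_IDENTIFIERS):
    """Return the size of the smallest group sharing the same quasi-identifier values."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(groups.values()) if groups else 0


def generalize(row):
    """Coarsen identifiers: keep a 3-digit ZIP prefix and round birth year down to the decade."""
    return {
        **row,
        "zip": row["zip"][:3] + "**",
        "birth_year": row["birth_year"] // 10 * 10,
    }


# Names are already stripped, yet two of these rows are unique.
rows = [
    {"zip": "02139", "birth_year": 1985, "gender": "F", "diagnosis": "asthma"},
    {"zip": "02139", "birth_year": 1985, "gender": "F", "diagnosis": "flu"},
    {"zip": "02139", "birth_year": 1990, "gender": "M", "diagnosis": "flu"},
    {"zip": "02141", "birth_year": 1990, "gender": "M", "diagnosis": "eczema"},
]

k_before = k_anonymity(rows)
k_after = k_anonymity([generalize(r) for r in rows])
```

Generalizing the fields raises k on this sample, but a higher k alone does not prove anonymization: small groups that share a sensitive value can still leak it, which is part of why the bar is high.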

The practical takeaway: treat anonymization as a high bar you must actively prove, and treat everything short of it — including encrypted and tokenized data — as personal data that stays inside your privacy obligations.


FAQ

Common questions

What is the difference between pseudonymization and anonymization?

Pseudonymized data can still be linked back to a person (a key or mapping exists to reverse it), so it remains personal data under GDPR. Anonymized data cannot reasonably be re-identified by anyone and has no reversal key, so it falls outside GDPR. The presence or absence of a realistic path to re-identification is the deciding factor.

Is encrypted data anonymized under GDPR?

No. Encryption is a pseudonymization technique because the data can be decrypted with the key. Encrypted personal data is still personal data and must be included in access and deletion requests. It only becomes effectively erased once the key is destroyed and unrecoverable.

Does pseudonymized data still fall under GDPR?

Yes. Recital 26 of the GDPR states that pseudonymized data which could be attributed to a person by the use of additional information should be considered information on an identifiable natural person, that is, personal data. Pseudonymization reduces risk and is encouraged, but it does not remove GDPR obligations.

