
Data retention policy: how long can you keep personal data?

GDPR's storage-limitation principle says you may keep personal data only as long as you actually need it. A data retention policy turns that principle into concrete schedules — and defensible, provable deletion when the clock runs out.


The principle

Storage limitation, in plain terms.

GDPR Article 5(1)(e) requires that personal data be kept in a form that permits identification of individuals for no longer than is necessary for the purposes for which it is processed. In other words: once you no longer need the data for a legitimate purpose, you must delete it or anonymize it so that it no longer identifies anyone.

There is no universal number of years. How long is too long depends on the purpose, the legal basis, and any overriding legal obligation to keep records. Tax, employment, and financial rules commonly set their own minimum retention periods, and for the records they cover those minimums take precedence over a shorter period that the purpose alone would justify.


Building a policy

From principle to schedule.

01 / Inventory the data

List the categories of personal data you hold and where they live. You cannot set a retention period for data you have not mapped.

02 / Assign a purpose and period

For each category, record why you hold it and how long that purpose — or a legal obligation — justifies keeping it. Document the reasoning, not just the number.

03 / Automate deletion

Turn each period into a scheduled, repeatable deletion job rather than an annual manual clean-up that quietly slips.

04 / Prove it happened

Keep evidence that data was deleted on schedule. Under the accountability principle, being able to demonstrate the deletion matters as much as doing it.
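
The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the RetentionRule and Record types, the run_deletion_job function, the category names, and every period shown are hypothetical placeholders, and the choice of when the retention clock starts (here a clock_start date on each record) is an assumption you would settle per category.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class RetentionRule:
    """One line of a retention schedule (all values are illustrative)."""
    category: str
    purpose: str                 # the documented reason for holding the data
    purpose_days: int            # how long that purpose justifies keeping it
    legal_minimum_days: int = 0  # statutory minimum, 0 if none applies

    @property
    def retention_days(self) -> int:
        # A legal minimum overrides a shorter purpose-based period.
        return max(self.purpose_days, self.legal_minimum_days)


@dataclass(frozen=True)
class Record:
    record_id: str
    category: str
    clock_start: date  # assumption: the date the retention clock starts


def run_deletion_job(records: list[Record], rules: list[RetentionRule], today: date):
    """Split records into those still retained and an evidence log of deletions.

    A record whose category has no rule raises KeyError: unmapped data
    cannot be given a retention period.
    """
    by_category = {rule.category: rule for rule in rules}
    kept, evidence = [], []
    for rec in records:
        rule = by_category[rec.category]
        expires_on = rec.clock_start + timedelta(days=rule.retention_days)
        if today >= expires_on:
            # In a real job the actual delete would happen here.
            evidence.append({
                "record_id": rec.record_id,
                "category": rec.category,
                "expired_on": expires_on.isoformat(),
                "deleted_on": today.isoformat(),
            })
        else:
            kept.append(rec)
    return kept, evidence


# Hypothetical schedule: the numbers are placeholders, not recommendations.
rules = [
    RetentionRule("marketing_contact", "newsletter delivery", purpose_days=365),
    RetentionRule("invoice", "billing", purpose_days=90, legal_minimum_days=2555),
]
records = [
    Record("m1", "marketing_contact", date(2023, 1, 1)),
    Record("i1", "invoice", date(2023, 1, 1)),
]

kept, evidence = run_deletion_job(records, rules, today=date(2024, 6, 1))
print([r.record_id for r in kept])
print(evidence)
```

Run on a schedule, a job of this shape covers steps 03 and 04 together: the deletion is repeatable, and the evidence entries record what was deleted, under which category, and when.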


The warehouse problem

Retention is hardest in the data warehouse.

In a transactional database, retention can be a scheduled DELETE. In an analytics warehouse it is far messier: the same record has been copied into staging tables, joined into marts, exported to dashboards, and captured in backups and time-travel windows. Deleting the source row leaves the copies behind, so the data is not really gone when your policy says it should be.
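
A small sketch makes the gap concrete. It uses Python's built-in sqlite3 module with an in-memory database; the customers and stg_customers tables, their columns, and the cutoff date are all hypothetical stand-ins for a source table and one downstream copy.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT, last_active TEXT);
CREATE TABLE stg_customers (id INTEGER, email TEXT);
INSERT INTO customers VALUES
    (1, 'old@example.com', '2020-01-15'),
    (2, 'new@example.com', '2024-05-01');
-- A pipeline copies the source into a staging table.
INSERT INTO stg_customers SELECT id, email FROM customers;
""")

# The scheduled retention job: delete rows inactive since before the cutoff.
# ISO-8601 date strings compare correctly as plain text.
cutoff = "2022-01-01"
deleted = conn.execute(
    "DELETE FROM customers WHERE last_active < ?", (cutoff,)
).rowcount
conn.commit()

remaining_source = conn.execute(
    "SELECT COUNT(*) FROM customers WHERE email = 'old@example.com'"
).fetchone()[0]
remaining_copy = conn.execute(
    "SELECT COUNT(*) FROM stg_customers WHERE email = 'old@example.com'"
).fetchone()[0]

print(deleted, remaining_source, remaining_copy)
```

The source row is gone, but the staging copy still holds the email address. A real warehouse multiplies this across marts, exports, and snapshots.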

This is why retention and the right to erasure are, in engineering terms, the same problem. Both require you to reach every copy of a person's data and produce evidence that it is gone. Crypto-shredding addresses both at once: encrypt personal fields on write, and when a retention period expires or an erasure request lands, destroy the key. Every copy that holds only ciphertext, including the ones in backups, then becomes unreadable in a single step that you can log as evidence. This holds only if no plaintext copy was made along the way and the key itself does not survive in a backup.
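
The mechanics of crypto-shredding can be sketched with the Python standard library alone. Everything here is an assumption made for illustration: the SubjectVault class, its method names, and the one-key-per-person layout are hypothetical, and the cipher is a toy keystream built from HMAC-SHA256 so the example runs without third-party packages. A real system would use a vetted authenticated cipher such as AES-GCM and a managed key store.

```python
import hashlib
import hmac
import secrets


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream from HMAC-SHA256(key, nonce || counter). Illustration only."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        block = hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256)
        out += block.digest()
        counter += 1
    return bytes(out[:length])


class KeyDestroyedError(Exception):
    """Raised when no key exists for a subject, e.g. after shredding."""


class SubjectVault:
    """Hypothetical key store holding one encryption key per data subject."""

    def __init__(self) -> None:
        self._keys: dict[str, bytes] = {}

    def encrypt(self, subject_id: str, plaintext: str) -> bytes:
        # Create the subject's key on first use, then reuse it.
        key = self._keys.setdefault(subject_id, secrets.token_bytes(32))
        nonce = secrets.token_bytes(16)
        data = plaintext.encode("utf-8")
        stream = _keystream(key, nonce, len(data))
        return nonce + bytes(a ^ b for a, b in zip(data, stream))

    def decrypt(self, subject_id: str, blob: bytes) -> str:
        key = self._keys.get(subject_id)
        if key is None:
            raise KeyDestroyedError(subject_id)
        nonce, body = blob[:16], blob[16:]
        stream = _keystream(key, nonce, len(body))
        return bytes(a ^ b for a, b in zip(body, stream)).decode("utf-8")

    def shred(self, subject_id: str) -> bool:
        """Destroy the subject's key. Returns True if a key was removed."""
        return self._keys.pop(subject_id, None) is not None


vault = SubjectVault()
token = vault.encrypt("user-42", "alice@example.com")

# The ciphertext spreads to downstream copies, as it would in a warehouse.
warehouse_copy = token
backup_copy = bytes(token)

readable_before = vault.decrypt("user-42", warehouse_copy)

# Retention period expires or an erasure request lands: destroy the key once.
shredded = vault.shred("user-42")

try:
    vault.decrypt("user-42", backup_copy)
    readable_after = True
except KeyDestroyedError:
    readable_after = False

print(readable_before, shredded, readable_after)
```

The design choice that matters is where the key lives: the copies can be scattered anywhere, but the key sits in one place, so destroying it is a single event that can be logged as deletion evidence.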


FAQ

Common questions

How long can you keep personal data under GDPR?

Only as long as necessary for the purpose it was collected for, unless a specific legal obligation requires you to keep it longer. GDPR sets no fixed maximum; you must define and justify a retention period per category of data, then delete or anonymize it when the period ends.

What is a data retention policy?

A documented set of rules stating, for each category of personal data, how long it is kept, why, and what happens when the period expires. A good policy pairs each period with an automated deletion process and a way to prove deletion occurred.

Does data retention apply to backups?

Yes. Backups and warehouse time-travel windows hold copies of personal data and are in scope for retention and erasure obligations. Because you usually cannot surgically edit a backup, crypto-shredding — where the retained copies are ciphertext that becomes useless once the key is destroyed — is a practical way to honour retention without rebuilding every snapshot.

