What Good Document Hygiene Looks Like in 2026

Document operations is a low-glamour discipline. There is no Hacker News for filing taxonomy. But in mature organisations, you can recognise good document hygiene by a small number of habits.

Here is the checklist we measure ourselves against. Use it to measure yours.

Every new document has a classification within 24 hours

AI suggests, humans confirm. Low-confidence cases go to a human review queue, which keeps the 24-hour SLA achievable even for them. Without this, your classification taxonomy decays.

Every classification has a retention policy

If you've classified something as “Customer Contract”, you should automatically know how long to retain it, what happens at expiry, and who reviews it. If your retention policy is “indefinitely” by default, you don't have a retention policy.
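As a minimal sketch of what "automatically know" can mean, the lookup below maps a classification to its retention rule. The class name, field names, and durations are illustrative assumptions, not recommendations.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class RetentionPolicy:
    retain_days: int     # how long to keep the document after creation
    at_expiry: str       # what happens at expiry, e.g. "archive" or "delete"
    reviewer_role: str   # who reviews the disposition


# Hypothetical classifications and durations, for illustration only.
RETENTION_POLICIES = {
    "Customer Contract": RetentionPolicy(7 * 365, "archive", "Legal"),
    "Employee Performance Review": RetentionPolicy(3 * 365, "delete", "HR"),
}


def retention_expiry(classification: str, created: date) -> date:
    # A KeyError is deliberate: a classification with no policy is incomplete.
    policy = RETENTION_POLICIES[classification]
    return created + timedelta(days=policy.retain_days)
```

Failing loudly on a missing policy is a design choice here: it stops "indefinitely" from becoming the silent default.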

Every classification has a permission template

Same logic: “Employee Performance Review” should automatically be permissioned to HR + the employee. Don't decide permissions per-document. Decide them per-classification.
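One way to picture a per-classification template, as a sketch only: the template structure, the "role:" and "user:" prefixes, and the second classification's roles are assumptions made for illustration.

```python
# Hypothetical permission templates keyed by classification.
PERMISSION_TEMPLATES = {
    "Employee Performance Review": {"roles": {"HR"}, "include_subject": True},
    "Customer Contract": {"roles": {"Legal", "Sales"}, "include_subject": False},
}


def allowed_principals(classification, subject_user=None):
    """Resolve who may read a document purely from its classification."""
    template = PERMISSION_TEMPLATES[classification]
    principals = {f"role:{role}" for role in template["roles"]}
    # Some classifications also grant access to the person the document is about.
    if template["include_subject"] and subject_user is not None:
        principals.add(f"user:{subject_user}")
    return principals
```

For example, `allowed_principals("Employee Performance Review", "alice")` resolves to the HR role plus Alice herself, with no per-document decision involved.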

Every workflow has an SLA

A workflow without an SLA is just a queue. SLAs make workflows measurable; measurable workflows are improvable.

Every workflow step has a delegate path

If the SLA is 24 hours and the assigned approver is on a 2-week leave, your SLA will breach. Delegates fix this structurally.
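A delegate path can be sketched as a short chain walk. The `Approver` type, its fields, and the `max_hops` guard are hypothetical; a real system would take absence data from its own calendar or HR source.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Approver:
    name: str
    away_from: Optional[date] = None
    away_until: Optional[date] = None
    delegate: Optional["Approver"] = None

    def is_away(self, today: date) -> bool:
        if self.away_from is None or self.away_until is None:
            return False
        return self.away_from <= today <= self.away_until


def resolve_approver(approver: Approver, today: date, max_hops: int = 5) -> Approver:
    """Follow the delegate chain until someone is available."""
    current = approver
    for _ in range(max_hops):  # guard against accidental delegate cycles
        if not current.is_away(today):
            return current
        if current.delegate is None:
            break
        current = current.delegate
    raise LookupError(f"no available approver starting from {approver.name}")
```

Raising when the chain runs out is intentional: a step with nobody to act on it should surface as an error, not as a silent SLA breach.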

Every external share has an expiry date

Indefinite external shares are the source of nearly every “we don't know who can see what” panic. Set a default expiry of 7 days at the system level.
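A system-level default might look like the sketch below. The function names and the dictionary shape are assumptions; the point is only that a caller who says nothing still gets a 7-day expiry.

```python
from datetime import datetime, timedelta, timezone

DEFAULT_SHARE_TTL = timedelta(days=7)  # the system-level default


def create_share(doc_id, recipient, now, ttl=None):
    """Create an external share; omitting ttl never means 'forever'."""
    ttl = DEFAULT_SHARE_TTL if ttl is None else ttl
    return {"doc_id": doc_id, "recipient": recipient, "expires_at": now + ttl}


def is_active(share, now):
    # A share is usable strictly before its expiry timestamp.
    return now < share["expires_at"]
```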

Every privileged role assignment is reviewed quarterly

Platform Admins, Tenant Admins, Compliance Officers, Auditors: these are sticky roles that accumulate. Audit them every quarter; revoke what's no longer needed.
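The quarterly check reduces to a date comparison. In this sketch the assignment records and the 90-day interval are assumptions standing in for "quarterly".

```python
from datetime import date, timedelta

REVIEW_INTERVAL = timedelta(days=90)  # assumed stand-in for "quarterly"


def overdue_reviews(assignments, today):
    """Return privileged role assignments whose last review is older than the interval."""
    return [a for a in assignments if today - a["last_reviewed"] > REVIEW_INTERVAL]
```

Feeding the result into a revoke-or-reconfirm task list is what turns the audit from a report into a habit.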

Every confidentiality classification matches reality

Documents drift. A document classified as “Internal” two years ago may now contain Restricted-class personal data. Run a periodic classification drift audit.

Every search result is checked for permission leakage

When you implement a new search feature, the first test should be: “can user A find documents user A shouldn't see?” If yes, you have a permission leak.
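That first test can be written as a small check over any search function. Everything here is a toy: the in-memory index, the ACL dictionary, and both search functions are hypothetical stand-ins for a real search backend.

```python
def leaked_results(results, user, acl):
    """Return the result IDs that `user` is not permitted to read."""
    return [doc_id for doc_id in results if user not in acl.get(doc_id, set())]


def naive_search(query, index):
    # Deliberately broken: matches on text and ignores permissions.
    return [doc_id for doc_id, text in index.items() if query in text]


def filtered_search(query, index, user, acl):
    # Applies the ACL before results ever leave the search layer.
    return [d for d in naive_search(query, index) if user in acl.get(d, set())]
```

Run `leaked_results` against every new search path with a user who should see almost nothing; any non-empty list is a permission leak.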

Every backup has been restored at least once

Untested backups are a fairy tale. Run a real restore drill quarterly. Judge the backup by what the restore actually produces, not just by the backup logs.
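One way to check a restore drill by its result is to compare the restored bytes against checksums recorded at backup time. The manifest format below is an assumption; `hashlib.sha256` is the standard-library call.

```python
import hashlib


def verify_restore(manifest, restored):
    """Return paths that are missing or whose restored bytes do not match.

    manifest: path -> SHA-256 hex digest recorded when the backup was taken
    restored: path -> bytes read back from the restore target
    """
    failures = []
    for path, expected_sha256 in manifest.items():
        data = restored.get(path)
        if data is None or hashlib.sha256(data).hexdigest() != expected_sha256:
            failures.append(path)
    return failures
```

An empty list means the drill passed for the files in the manifest; it says nothing about files the manifest never covered.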

Every shared link has a kill switch

If a counterparty leaves the project, you should be able to revoke all their share links in one click. If you can't, you've shipped a permission system that's structurally broken.
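The "one click" is, underneath, a single bulk operation keyed on the counterparty. This sketch assumes each share record carries a `recipient` and a `revoked_at` field; both names are hypothetical.

```python
from datetime import datetime, timezone


def revoke_all_for(shares, counterparty, now):
    """Revoke every still-active share held by one counterparty; return the count."""
    revoked = 0
    for share in shares:
        if share["recipient"] == counterparty and share["revoked_at"] is None:
            share["revoked_at"] = now
            revoked += 1
    return revoked
```

If your data model cannot answer "all shares for this recipient" in one query, that is the structural problem to fix first.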

Every AI extraction has a confidence threshold

Below the threshold, route to human review. Above, accept silently. Without a threshold, you're choosing between trusting all AI (dangerous) or trusting none (wasteful).
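The routing rule itself is tiny. The 0.85 default is an arbitrary placeholder, and treating a score exactly at the threshold as accepted is an assumption; tune both against your own review data.

```python
def route_extraction(confidence, threshold=0.85):
    """Route one AI extraction by its confidence score (0.0 to 1.0)."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    # Assumption: a score exactly at the threshold counts as accepted.
    return "accept" if confidence >= threshold else "human_review"
```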

Every retention disposition is reviewed before deletion

Auto-deletion at policy expiry is fine if a human reviewer can override within 30 days. Auto-deletion without that grace is how organisations lose evidence they later need.
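The grace period can be modelled as an extra state between "retained" and "delete". The state names and the `overridden` flag are hypothetical; the 30-day window comes from the rule above.

```python
from datetime import date, timedelta

GRACE_PERIOD = timedelta(days=30)


def disposition_state(policy_expiry, today, overridden=False):
    """Decide what should happen to a document on a given day."""
    if overridden:
        return "retained_by_override"  # a reviewer stepped in
    if today < policy_expiry:
        return "retained"
    if today < policy_expiry + GRACE_PERIOD:
        return "pending_deletion"      # reviewers can still override
    return "delete"
```

Only the final state should trigger an irreversible delete; everything before it stays recoverable.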

Every tenant has a documented data flow map

This one is for your DPO. Without it, DSARs become exploratory adventures rather than queries.

You can answer “where does this document live, who has access, and when does it expire” in under 30 seconds

If you can't, you're failing the hygiene test. Tools should make this trivial. If they don't, change tools.

Calibration

Score yourself out of 15. Below 8 — you have foundational work to do. 8-11 — you're on the path. 12-14 — you're doing well. 15 — you're either lying or running Papyrus.