What Good Document Hygiene Looks Like in 2026
A short, opinionated checklist of habits that mature document operations have figured out. Use it as a self-audit.
Document operations is a low-glamour discipline. There is no Hacker News for filing taxonomy. But in mature organisations, you can recognise good document hygiene by a small number of habits.
Here is the checklist we measure ourselves against. Use it to audit your own operation.
Every new document has a classification within 24 hours
AI suggests, humans confirm. A review queue for low-confidence suggestions keeps the 24-hour SLA achievable. Without it, unclassified documents pile up and your classification taxonomy decays.
Every classification has a retention policy
If you've classified something as “Customer Contract”, you should automatically know how long to retain it, what happens at expiry, and who reviews. If your retention policy is “indefinitely” by default, you don't have a retention policy.
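One way to make that automatic is a lookup table keyed by classification. The sketch below is illustrative only: the `RetentionPolicy` fields, the retention periods, the action names, and the reviewer roles are all assumptions, not values from any real policy.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class RetentionPolicy:
    retain_days: int      # how long to keep the document
    on_expiry: str        # what happens when the period ends
    reviewer_role: str    # who reviews the disposition


# Hypothetical policies, for illustration only.
RETENTION_POLICIES = {
    "Customer Contract": RetentionPolicy(2555, "review_then_delete", "Legal"),
    "Employee Performance Review": RetentionPolicy(1095, "review_then_delete", "HR"),
}


def retention_expiry(classification: str, created: date) -> date:
    # No silent default: an unknown classification raises KeyError,
    # which surfaces the missing policy instead of retaining "indefinitely".
    policy = RETENTION_POLICIES[classification]
    return created + timedelta(days=policy.retain_days)
```

The deliberate choice here is the missing fallback: a classification without a policy fails loudly rather than defaulting to "keep forever".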
Every classification has a permission template
Same logic: “Employee Performance Review” should automatically be permissioned to HR + the employee. Don't decide permissions per-document. Decide them per-classification.
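A minimal sketch of per-classification permissions, assuming a simple template format. The `PERMISSION_TEMPLATES` structure, the `role:`/`user:` prefixes, and the "Customer Contract" roles are hypothetical; only the HR-plus-employee rule comes from the text above.

```python
# Hypothetical template format: roles that always get access, plus a flag
# saying whether the person the document is about gets access too.
PERMISSION_TEMPLATES = {
    "Employee Performance Review": {"roles": {"HR"}, "include_subject": True},
    "Customer Contract": {"roles": {"Legal", "Sales"}, "include_subject": False},
}


def allowed_principals(classification, subject_user=None):
    """Derive the access list from the classification, never from the document."""
    template = PERMISSION_TEMPLATES[classification]
    principals = {f"role:{role}" for role in template["roles"]}
    if template["include_subject"] and subject_user is not None:
        principals.add(f"user:{subject_user}")
    return principals
```

With these templates, `allowed_principals("Employee Performance Review", "alice")` yields the HR role plus Alice herself, and nobody has to decide that again for the next review.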
Every workflow has an SLA
A workflow without an SLA is just a queue. SLAs make workflows measurable; measurable workflows are improvable.
Every workflow step has a delegate path
If the SLA is 24 hours and the assigned approver is on a 2-week leave, your SLA will breach. Delegates fix this structurally.
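A delegate path can be sketched as a chain that is walked until someone is available. The `Approver` class, its field names, and the hop limit below are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional, Tuple


@dataclass
class Approver:
    name: str
    leave: List[Tuple[date, date]] = field(default_factory=list)  # inclusive ranges
    delegate: Optional["Approver"] = None

    def available_on(self, day: date) -> bool:
        return not any(start <= day <= end for start, end in self.leave)


def resolve_approver(approver: Approver, day: date, max_hops: int = 5) -> Optional[Approver]:
    """Follow the delegate chain until someone is available, or give up."""
    current: Optional[Approver] = approver
    for _ in range(max_hops):      # hop limit guards against delegate cycles
        if current is None:
            return None            # chain ended: escalate rather than wait
        if current.available_on(day):
            return current
        current = current.delegate
    return None
```

A `None` result is the useful signal: it means the step has no reachable approver today and should escalate instead of quietly breaching its SLA.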
Every external share is time-bounded
Indefinite external shares are the source of nearly every “we don't know who can see what” panic. Set a default expiry of 7 days at the system level.
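A system-level default can be as small as this. The function names and the dictionary-based share record are hypothetical; the 7-day default is the one recommended above.

```python
from datetime import datetime, timedelta
from typing import Optional

DEFAULT_EXTERNAL_SHARE_TTL = timedelta(days=7)  # system-level default from the checklist


def create_external_share(doc_id: str, recipient: str, now: datetime,
                          ttl: Optional[timedelta] = None) -> dict:
    """Create a share record that always carries an expiry."""
    ttl = DEFAULT_EXTERNAL_SHARE_TTL if ttl is None else ttl
    if ttl <= timedelta(0):
        raise ValueError("ttl must be positive; indefinite shares are not allowed")
    return {"doc_id": doc_id, "recipient": recipient, "expires_at": now + ttl}


def is_share_active(share: dict, now: datetime) -> bool:
    # A share stops working at the moment it expires.
    return now < share["expires_at"]
```

Callers can pass a longer or shorter `ttl`, but there is no way to create a share without an `expires_at`.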
Every privileged role assignment is reviewed quarterly
Platform Admins, Tenant Admins, Compliance Officers, Auditors: these are sticky roles that accumulate. Audit them every quarter; revoke what's no longer needed.
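The quarterly audit starts with a list of assignments that are overdue for review. This sketch assumes each assignment is a dictionary with `role` and `last_reviewed` keys, and treats a quarter as 92 days; both are illustrative choices.

```python
from datetime import date, timedelta

PRIVILEGED_ROLES = {"Platform Admin", "Tenant Admin", "Compliance Officer", "Auditor"}
REVIEW_INTERVAL = timedelta(days=92)  # assumption: roughly one quarter


def overdue_role_reviews(assignments, today):
    """Return privileged assignments whose last review is older than one quarter."""
    return [
        a for a in assignments
        if a["role"] in PRIVILEGED_ROLES
        and today - a["last_reviewed"] > REVIEW_INTERVAL
    ]
```

Anything this returns is a candidate for revocation until someone confirms the role is still needed.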
Every confidentiality classification matches reality
Documents drift. A document classified as “Internal” two years ago may now contain Restricted-class personal data. Run a periodic classification drift audit.
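A drift audit compares the stored label with what a fresh look at the content suggests. In this sketch the `detected_level` field stands in for the output of whatever content scan you run, and the four-level ordering is an assumption; only "Internal" and "Restricted" appear in the text above.

```python
# Assumed ordering, least to most sensitive.
LEVELS = ["Public", "Internal", "Confidential", "Restricted"]


def classification_drift(documents):
    """Return (doc_id, stored, detected) where the detected level outranks the stored label."""
    drifted = []
    for doc in documents:
        stored, detected = doc["stored_level"], doc["detected_level"]
        if LEVELS.index(detected) > LEVELS.index(stored):
            drifted.append((doc["id"], stored, detected))
    return drifted
```

The audit only flags documents that have become more sensitive than their label says, since that is the direction that creates exposure.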
Every search result is checked for permission leakage
When you implement a new search feature, the first test should be: “can user A find documents user A shouldn't see?” If yes, you have a permission leak.
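That first test can be written as a small check that runs any search function and reports results the user has no right to see. Everything here is a toy: the `ACL` and `INDEX` dictionaries and both search functions are hypothetical stand-ins for a real index.

```python
ACL = {"doc-1": {"alice"}, "doc-2": {"bob"}}
INDEX = {"doc-1": "quarterly budget", "doc-2": "budget for bob's team"}


def leaky_search(user, query):
    # Deliberately broken: matches on text and ignores the ACL entirely.
    return [doc_id for doc_id, text in INDEX.items() if query in text]


def filtered_search(user, query):
    # Same matching, but results are filtered by the caller's permissions.
    return [d for d in leaky_search(user, query) if user in ACL.get(d, set())]


def permission_leaks(search_fn, user, query, acl):
    """Return the ids in search_fn's results that `user` is not allowed to read."""
    return [d for d in search_fn(user, query) if user not in acl.get(d, set())]
```

Run against `leaky_search`, the check reports `doc-2` for Alice; run against `filtered_search`, it reports nothing. A non-empty list is a failing test.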
Every backup has been restored at least once
Untested backups are a fairy tale. Run a real restore drill quarterly. Judge the drill by the restored data, not just the backup logs.
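One way to check the restored data itself is to compare it against checksums recorded when the backup was taken. This is a sketch under stated assumptions: the manifest format (path to SHA-256 hex digest) and the in-memory `restored` mapping are illustrative, and a real drill would read files from the restore target.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def verify_restore(manifest, restored):
    """Compare restored content against checksums recorded at backup time.

    manifest: path -> sha256 hex digest recorded when the backup was taken
    restored: path -> bytes read back after the restore drill
    """
    missing = sorted(p for p in manifest if p not in restored)
    corrupted = sorted(
        p for p in manifest
        if p in restored and sha256_hex(restored[p]) != manifest[p]
    )
    return {"ok": not missing and not corrupted, "missing": missing, "corrupted": corrupted}
```

A drill passes only when `ok` is true, whatever the backup job's own log claims.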
Every shared link has a kill switch
If a counterparty leaves the project, you should be able to revoke all their share links in one click. If you can't, you've shipped a permission system that's structurally broken.
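The kill switch is only "one click" if every share records which counterparty holds it. A minimal sketch, assuming share records are dictionaries with hypothetical `counterparty` and `revoked` fields:

```python
def revoke_counterparty_shares(shares, counterparty):
    """Revoke every active share held by one counterparty; return how many were revoked."""
    revoked = 0
    for share in shares:
        if share["counterparty"] == counterparty and not share["revoked"]:
            share["revoked"] = True
            revoked += 1
    return revoked
```

The returned count doubles as an audit record: it tells you how much access the counterparty actually had at the moment you cut it off.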
Every AI extraction has a confidence threshold
Below the threshold, route to human review. Above, accept silently. Without a threshold, you're choosing between trusting all AI (dangerous) or trusting none (wasteful).
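The routing rule itself is a few lines. The 0.9 default and the route names are assumptions for illustration, as is the choice to accept a score exactly at the threshold; the right threshold depends on your extraction model and your tolerance for errors.

```python
def route_extraction(confidence: float, threshold: float = 0.9) -> str:
    """Decide what happens to one AI-extracted value based on its confidence score."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    # At or above the threshold the value is accepted; below it, a human reviews it.
    return "accept" if confidence >= threshold else "human_review"
```

Keeping the threshold as a parameter lets you set it per field, so a contract value can demand more confidence than a document title.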
Every retention disposition is reviewed before deletion
Auto-deletion at policy expiry is fine if a human reviewer can override within 30 days. Auto-deletion without that grace is how organisations lose evidence they later need.
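The grace period can be modelled as an extra state between "retained" and "deleted". The state names and the `reviewer_override` flag below are hypothetical; the 30-day window is the one from the text above, and the sketch assumes the override is recorded before the deletion job runs.

```python
from datetime import date, timedelta

GRACE_PERIOD = timedelta(days=30)


def disposition_state(policy_expiry: date, today: date, reviewer_override: bool = False) -> str:
    """Return where a document sits in the retention lifecycle."""
    if today < policy_expiry:
        return "retained"
    if reviewer_override:
        return "retained_by_override"    # a human decided to keep it
    if today < policy_expiry + GRACE_PERIOD:
        return "pending_review"          # expired, but still recoverable
    return "eligible_for_deletion"
```

Only documents in the last state should ever reach the deletion job.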
Every tenant has a documented data flow map
Your DPO needs this. Without it, DSARs become exploratory adventures rather than queries.
You can answer “where does this document live, who has access, and when does it expire” in under 30 seconds
If you can't, you're failing the hygiene test. Tools should make this trivial. If they don't, change tools.
Calibration
Score yourself out of 15, one point per habit. Below 8: you have foundational work to do. 8-11: you're on the path. 12-14: you're doing well. 15: you're either lying or running Papyrus.