FAQs

AI Classification and Extraction

How AI tags your documents, what gets extracted, confidence scores, when humans review, and how to correct mistakes.

What does AI classification do?

When a document arrives, the AI predicts what kind of document it is (invoice, contract, employee record, memo, board minutes, and so on). It chooses from 25+ default categories, plus any custom categories your tenant has defined.

What's the accuracy?

Classification accuracy is currently ~91% on English documents, ~94% on Swahili documents, and ~89% on mixed-language documents. The model is retrained quarterly. See How Papyrus Achieved 94% Classification Accuracy on Swahili Documents.

What if the AI gets it wrong?

Open the document, click the classification chip, and choose the correct type. The correction is captured and feeds into the next training cycle. Users with the admin role can also bulk-correct from the Classification Review queue.

What's a confidence score?

A number from 0.0 to 1.0 indicating how sure the AI is of its classification. If the score is below a threshold (default 0.70), the classification is routed to a human review queue. Above the threshold, the classification is accepted automatically, but you can still revise it at any time.
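The routing rule can be sketched in a few lines of Python. This is an illustration only, not the product's actual code: the names `Classification` and `route` are hypothetical, and treating a score exactly equal to the threshold as accepted is an assumption, since the answer above only describes scores below and above it.

```python
from dataclasses import dataclass

DEFAULT_THRESHOLD = 0.70  # default value stated in the FAQ


@dataclass
class Classification:
    """Hypothetical container for one classification result."""
    doc_type: str
    confidence: float  # expected to lie in 0.0..1.0


def route(result: Classification, threshold: float = DEFAULT_THRESHOLD) -> str:
    """Return 'human_review' for low-confidence results, else 'accepted'."""
    if not 0.0 <= result.confidence <= 1.0:
        raise ValueError("confidence must be between 0.0 and 1.0")
    # Assumption: a score exactly at the threshold counts as accepted.
    if result.confidence < threshold:
        return "human_review"
    return "accepted"
```

For example, under these assumptions `route(Classification("invoice", 0.55))` sends the document to human review, while a score of 0.92 is accepted without review.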

What metadata gets extracted?

It depends on the document type. For invoices: invoice number, dates, vendor, KRA PIN, line items, and totals. For contracts: parties, effective dates, value, and key clauses. For employee records: name, ID, position, and dates. The extracted fields appear on the document detail panel.
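One way to picture the per-type field lists is as a simple lookup table. The sketch below is hypothetical: the dictionary keys, the field spellings, and the `missing_fields` helper are illustrative names, not the product's schema or API.

```python
# Field lists taken from the FAQ answer; key names are illustrative.
EXTRACTED_FIELDS = {
    "invoice": ["invoice number", "dates", "vendor", "KRA PIN", "line items", "totals"],
    "contract": ["parties", "effective dates", "value", "key clauses"],
    "employee record": ["name", "ID", "position", "dates"],
}


def missing_fields(doc_type: str, extracted: dict) -> list:
    """List the expected fields that have no (or an empty) extracted value."""
    expected = EXTRACTED_FIELDS.get(doc_type.lower(), [])
    return [field for field in expected if not extracted.get(field)]
```

A check like this could be used to spot documents whose extraction came back incomplete, for instance an invoice with a vendor but no totals.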

Can I add custom extraction fields?

Yes (Business+ plans). Define fields per document type — e.g., “Tender Reference” for procurement documents. The AI is fine-tuned to extract these once you've labelled 50+ examples.

Does AI see Restricted documents?

The AI is subject to the same role-based access control (RBAC) as human users: the pipeline runs inside the tenant boundary, sees only what users are permitted to see, and respects classification labels. Restricted documents are processed, but their content appears in cross-document semantic search only for users who have permission to view them.
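Conceptually, this means search results are filtered by the requesting user's permissions before anything is returned. The following is a minimal sketch of that idea, assuming a role-based model; `Doc`, `allowed_roles`, and `searchable_for` are hypothetical names, and the real permission model may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    """Hypothetical document record with a label and permitted roles."""
    doc_id: str
    label: str                # e.g. "Restricted" or "Internal"
    allowed_roles: frozenset  # roles permitted to view this document


def searchable_for(user_roles: set, docs: list) -> list:
    """Keep only documents that share at least one role with the user."""
    return [d for d in docs if d.allowed_roles & user_roles]


docs = [
    Doc("d1", "Restricted", frozenset({"hr_admin"})),
    Doc("d2", "Internal", frozenset({"hr_admin", "staff"})),
]
visible = [d.doc_id for d in searchable_for({"staff"}, docs)]
```

In this example a user holding only the `staff` role would have `d2` in their search scope but not the Restricted document `d1`.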

Can I disable AI processing?

Yes, per document type. Admin → Classification Types lets you toggle classification and extraction per type. Useful if you want a particular category processed manually for control reasons.
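The per-type toggles can be thought of as two switches per document type. The sketch below is illustrative only: `TypeSettings` and `pipeline_steps` are hypothetical names, and the assumption that both switches default to on for types without explicit settings is not stated in the FAQ.

```python
from dataclasses import dataclass


@dataclass
class TypeSettings:
    """Hypothetical per-type switches; defaulting to enabled is an assumption."""
    classification_enabled: bool = True
    extraction_enabled: bool = True


def pipeline_steps(doc_type: str, settings: dict) -> list:
    """Return the AI steps that would run for a given document type."""
    cfg = settings.get(doc_type, TypeSettings())
    steps = []
    if cfg.classification_enabled:
        steps.append("classify")
    if cfg.extraction_enabled:
        steps.append("extract")
    return steps


# Example: board minutes handled fully manually, contracts classified only.
settings = {
    "board minutes": TypeSettings(classification_enabled=False, extraction_enabled=False),
    "contract": TypeSettings(extraction_enabled=False),
}
```

With these example settings, board minutes skip AI processing entirely, contracts are classified but not extracted, and any type without an entry runs both steps.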
