A University Faculty Manages 30,000 Theses
One of Kenya's largest universities digitised three decades of postgraduate theses — and made them searchable, retrievable, and citation-ready.
"Alumni used to wait three weeks for a transcript. Now: 48 hours." — Faculty Registrar
The numbers
- 30,000+ postgraduate theses ingested
- 3 decades of historical archive digitised
- 48 hours turnaround on alumni transcript requests (down from 3 weeks)
- 94% thesis classification accuracy after fine-tuning
The scope
A Faculty of Postgraduate Studies at one of Kenya's largest public universities. Theses spanned 30+ years across:
- Masters dissertations
- PhD theses
- Postdoctoral research reports
- Examiner reports
- Defence panel records
The physical archive occupied two rooms; the digital archive lived on a network drive with inconsistent metadata.
The challenge
Three audiences needed access:
- Current students researching prior work (couldn't find it; resorted to Google Scholar even for in-house work)
- Faculty building bibliographies (similarly struggled)
- Alumni requesting copies of their own theses for further academic / professional use (3-week turnaround)
Plus an organisational requirement: Commission for University Education (CUE) accreditation reviews ask for thesis-related records, and the existing archive would not have been accepted as compliant.
The migration
A two-phase digitisation and ingestion programme over 14 months:
Phase 1 (months 1-3), born-digital theses: The last 8 years of theses already existed as PDFs on the network drive. These were ingested directly, and AI extracted the metadata (title, author, supervisor, year, abstract, keywords).
Phase 2 (months 4-14), historical scanning: An outsourced scanning bureau processed the physical archive. Roughly 22,000 historical theses were scanned at 300 DPI, OCR'd, and ingested. Spot checks on a 2% sample confirmed quality.
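To make the 2% spot check concrete, here is a minimal sketch of how a review sample could be drawn from a scanned batch. It is an illustration only, not the bureau's or Papyrus's actual procedure: the function name `spot_check_sample`, the `percent` and `seed` parameters, and the placeholder thesis IDs are all assumptions.

```python
import random


def spot_check_sample(thesis_ids, percent=2, seed=None):
    """Return a random sample of IDs for manual quality review.

    `percent` is a whole-number sampling rate. The sample size is rounded
    up so that even a very small batch gets at least one check.
    """
    if not 0 < percent <= 100:
        raise ValueError("percent must be between 1 and 100")
    ids = list(thesis_ids)
    if not ids:
        return []
    # Integer ceiling division avoids floating-point rounding surprises.
    size = (len(ids) * percent + 99) // 100
    rng = random.Random(seed)  # a seed makes the sample reproducible
    return rng.sample(ids, size)


# Hypothetical batch: 22,000 placeholder IDs, matching the rough Phase 2 volume.
batch = [f"thesis-{n:05d}" for n in range(22_000)]
sample = spot_check_sample(batch, percent=2, seed=42)
print(len(sample))  # 440
```

At the volume quoted above, a 2% rate works out to about 440 theses reviewed by hand. Fixing the seed means the same sample can be regenerated later if a check needs to be audited.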
The outcomes
- Student research: Theses now appear in semantic search by topic, not just by author/year. A student researching “drought-resistant maize hybrids” surfaces relevant historical work without remembering specific titles.
- Faculty bibliographies: Plagiarism checks against the in-house corpus added a layer beyond commercial tools.
- Alumni transcripts: Transcript requests now follow a Papyrus workflow — identity verification, registrar approval, signed PDF delivery via secure share link. Turnaround averages 48 hours.
- CUE accreditation: The next review accepted the digital archive as compliant. The dean's response: “Finally.”
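The transcript workflow described above (identity verification, then registrar approval, then signed PDF delivery) can be pictured as a small state machine in which no step may be skipped. The sketch below is an assumption about how such a workflow might be modelled, not Papyrus's API: `Stage`, `NEXT_STAGE`, and `TranscriptRequest` are hypothetical names.

```python
from enum import Enum


class Stage(Enum):
    SUBMITTED = "submitted"
    IDENTITY_VERIFIED = "identity_verified"
    REGISTRAR_APPROVED = "registrar_approved"
    DELIVERED = "delivered"  # signed PDF sent via secure share link


# Each stage may only advance to the one listed here; DELIVERED is final.
NEXT_STAGE = {
    Stage.SUBMITTED: Stage.IDENTITY_VERIFIED,
    Stage.IDENTITY_VERIFIED: Stage.REGISTRAR_APPROVED,
    Stage.REGISTRAR_APPROVED: Stage.DELIVERED,
}


class TranscriptRequest:
    """Hypothetical request object that tracks its current workflow stage."""

    def __init__(self, request_id):
        self.request_id = request_id
        self.stage = Stage.SUBMITTED

    def advance(self, to):
        """Move to the next stage, rejecting skipped or repeated steps."""
        expected = NEXT_STAGE.get(self.stage)
        if to is not expected:
            raise ValueError(
                f"cannot move from {self.stage.value} to {to.value}"
            )
        self.stage = to


# Usage: a request must pass every step in order before delivery.
request = TranscriptRequest("REQ-0001")  # placeholder ID
request.advance(Stage.IDENTITY_VERIFIED)
request.advance(Stage.REGISTRAR_APPROVED)
request.advance(Stage.DELIVERED)
print(request.stage.value)  # delivered
```

Modelling the steps as explicit transitions means a request cannot reach delivery without registrar approval, which is the property the workflow is meant to guarantee.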
What stalled
Some faculty members were uncomfortable with their early-career work becoming easily searchable. The faculty addressed this with a clear access policy:
- Public: thesis title, author, abstract (always)
- Faculty-internal: full thesis content
- External (alumni / institutional): on request via the transcript workflow
This is the policy that already governed physical access; the digital version simply enforced it consistently.
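The three-tier policy above could be enforced with a simple field filter. This is a rough sketch under stated assumptions rather than the faculty's implementation: the `Audience` enum, the `visible_fields` function, the `request_approved` flag, and the dictionary layout of a thesis record are all hypothetical.

```python
from enum import Enum


class Audience(Enum):
    PUBLIC = "public"
    FACULTY = "faculty"
    EXTERNAL = "external"  # alumni / institutional requesters


# Fields that are always visible, per the policy above.
PUBLIC_FIELDS = ("title", "author", "abstract")


def visible_fields(thesis, audience, request_approved=False):
    """Return only the parts of a thesis record the audience may see.

    Faculty see everything. External requesters see everything only once
    their request has been approved. Everyone else gets the public fields.
    """
    full_access = audience is Audience.FACULTY or (
        audience is Audience.EXTERNAL and request_approved
    )
    if full_access:
        return dict(thesis)
    return {key: thesis[key] for key in PUBLIC_FIELDS if key in thesis}


# Usage with a placeholder record (all values are illustrative).
thesis = {
    "title": "Example title",
    "author": "Example author",
    "abstract": "Example abstract",
    "full_text": "Example body",
}
print(sorted(visible_fields(thesis, Audience.PUBLIC)))
# ['abstract', 'author', 'title']
```

Keeping the rule in one function is one way to get the consistency the policy depends on: every search result, download, and share link would pass through the same check.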
Quote
"The 1996 thesis cited in last year's PhD defence — that was found in Papyrus. Twenty-seven years on a shelf, then suddenly relevant." — Faculty Dean