Compliance-aligned underwriting assistant for a working-capital lender — Custom tier
Client: Mid-market fintech (SMB working-capital lender, $50k–$1.5M lines)
Custom-tier build for a regulated lender that needed extraction reliable enough to feed a credit model and auditable enough to explain to examiners. An embedded pod owned model selection, retrieval, the eval suite, and the SOC2-aligned audit ledger. The system now pre-decisions 92% of applications with a fully cited reasoning trail for every field.
- 99.4% extraction accuracy on credit-model fields
- 11 hrs median time-to-decision (was 6.1 days)
- 0.2 pts default-rate delta vs. control (flat)
- 100% of decisions with field-level citations to the source PDF
Challenge
A working-capital lender funding $50k–$1.5M lines. Underwriting required human review of bank statements, tax returns, and AR aging reports, typically 40–60 pages per application. The average decision took 6.1 business days against competitors quoting 48 hours, losing roughly a third of qualified applicants. A previous document-AI vendor delivered 78% extraction accuracy, unusable as a credit-model input where misreading current-period revenue means a six-figure write-off.
The Pro tier wouldn't fit. They needed: (a) 99%+ extraction accuracy on the credit-model fields, (b) field-level citation back to source pages for the audit trail, (c) a kill-switch the chief credit officer could pull on any specific borrower segment, and (d) SOC2-aligned logging that bank examiners would accept. That meant Custom tier: an embedded pod, a 14-week build, and an 8-week parallel-run validation against the existing underwriters.
Approach
A document-classification layer (statement vs. tax return vs. AR aging) feeds field-specific extractors. Each extractor has its own eval set built from 3,200 historical applications hand-labeled by the senior underwriter. Every extracted field carries a confidence score and a citation to the source page. The credit model refuses to decision an application when any input field falls below threshold; those applications route to a human with the low-confidence fields flagged.
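The confidence gate above can be sketched in a few lines. This is an illustrative sketch, not the production code: the `ExtractedField` shape, the field names, and the 0.98 threshold are assumptions.

```python
from dataclasses import dataclass

@dataclass
class ExtractedField:
    name: str
    value: str
    confidence: float  # 0.0-1.0, reported by the extractor
    citation: str      # e.g. "bank_statement.pdf p.4" (hypothetical format)

CONFIDENCE_THRESHOLD = 0.98  # assumed per-field gate

def route_application(fields: list[ExtractedField]) -> dict:
    """Auto-decision only when every credit-model input clears the gate;
    otherwise route to a human with the weak fields flagged."""
    low = [f for f in fields if f.confidence < CONFIDENCE_THRESHOLD]
    if low:
        return {"route": "human_review",
                "flagged_fields": [f.name for f in low]}
    return {"route": "auto_decision"}
```

The key design point is that the gate is per field, not per document: one shaky number is enough to pull the whole application out of the automated path.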
Audit ledger sits on Postgres with append-only writes; every decision logs the model version, prompt hash, retrieved context, and field-level citations. Examiners can replay any decision deterministically.
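A ledger row of the kind described above might be assembled like this. A minimal sketch under assumptions: the column names and the choice of SHA-256 are ours, not the client schema.

```python
import hashlib
import json

def ledger_record(model_version: str, prompt: str,
                  retrieved_context: list[str],
                  citations: dict[str, str], decision: str) -> dict:
    """Build one append-only row. Hashing the prompt and the retrieved
    context lets an examiner verify a replay used byte-identical inputs."""
    return {
        "model_version": model_version,
        "prompt_hash": hashlib.sha256(prompt.encode()).hexdigest(),
        "context_hash": hashlib.sha256(
            json.dumps(retrieved_context, sort_keys=True).encode()
        ).hexdigest(),
        "citations": citations,  # field name -> source page
        "decision": decision,
    }
```

On the Postgres side, "append-only" is typically enforced by granting the application role INSERT but revoking UPDATE and DELETE on the ledger table, so a row can never be silently altered after the fact.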
We ran the system in parallel with the human underwriting team for eight weeks. The system's decision was logged but not binding; weekly reconciliation reviewed every disagreement. Those reviews produced 31 material prompt and schema adjustments before flipping to binding mode.
Outcome
92% of applications pre-decisioned without human review. Median time-to-decision: 11 working hours (was 6.1 business days). Default rate within 0.2 pts of historical control — statistically indistinguishable. Application-to-funded conversion up 34%. Eleven of fourteen underwriters redeployed to portfolio monitoring. The audit ledger was accepted by their primary regulator on first review; the field-level citation trail meant no examiner ever asked us to explain a decision twice. Custom-tier engagement transitioned to a $12k/mo on-call retainer at month 6.
Stack
- Claude Opus (extraction + reasoning)
- Custom document classifier (in-house)
- Append-only decision ledger on Postgres
- Airflow + Kafka orchestration
- Field-level eval harness (3,200 labeled apps)
Working on something similar?
A partner will respond personally within one business day. If there isn't a fit, we'll tell you so and point you somewhere better.