Document processing pipeline

Image cleanup. OCR. AI.
Stapling. Format conversion.
One pipeline.

Pull recording packets in from any source, clean and OCR every page, classify and extract instrument-level data with AI, staple into formatted transmissions, and deliver to any title plant. The same pipeline runs as full automation or as a first-pass extractor in front of your existing keying team.

[Screenshot of app.titletools.io, batch processing: 137 images and 22 documents queued for indexing, with Add Images, Staple, and Export quick actions]

Built for

High-volume document operations.

Title plant operators

Daily go-forward posting from your county sources, in your plant's native format. Lower per-instrument cost, faster turnaround, and the same audit trail your team already expects.

Keying-services owners

Use the pipeline as a first-pass extractor in front of your existing keying team. Your keyers verify and correct exceptions instead of typing every field from scratch — throughput per seat goes up without adding heads.

County recorders & aggregators

Standardize ad-hoc scanner output into clean, structured transmissions for downstream consumers. One ingest pipeline, many export formats.

The pipeline

Five stages, one tool.

Acquire → Staple → Index → Examine → Export. Configurable per project — use the whole flow, or just the stages you need.

1. Acquire

ZIP upload, S3 bucket pull, watch folder, or scheduled batch. Bring in raw county scanner output, recorder dumps, mixed-format PDFs, or a daily archive — whatever your source produces.

2. Staple

Group single-page TIFFs into multi-page documents. Auto-Stapler suggests boundaries from layout and content; an operator confirms or adjusts from the keyboard. Image cleanup (deskew, denoise, and auto-rotate) runs as part of intake.

3. Index

Capture the metadata fields each instrument needs — doc type, instrument number, recording date, book/page, parties. AI auto-fills with confidence scores; the indexer accepts or overrides. Exceptions surface here for human attention.

4. Examine

Verify the extracted data against the source page. A split-screen viewer highlights the exact pixel region each value came from. Exceptions get flagged with notes that carry forward to the transmission.

5. Export

Transmission archive emitted in your plant's format — PropertySync, industry-standard pipe-delimited transmission, custom CSV, or JSON. One pipeline, many delivery targets.

Format-to-format conversion

From any source. To any plant.

Flexible import and export adapters. We support the industry-standard transmission formats — TitleSearch, PropertySync, custom pipe-delimited — and we add new adapters all the time, county by county and plant by plant.

Import adapters

  • Industry-standard transmission formats: TitleSearch and other plant-vendor formats
  • County recording ZIP: Multi-page TIFFs + thin EXTRACT*.csv
  • Mixed-format PDF packet: Multi-document PDFs auto-split per instrument
  • Bulk S3 / object-store pull: Continuous daily ingestion
  • Watch folder: Drop files into a tracked directory
  • Scheduled batch: Pull on a cron from any HTTP / SFTP source
  • Direct upload: PDF, TIFF, PNG, JPG via the workspace

Export adapters

  • PropertySync: Direct API post or JSON payload
  • TitleSearch + other industry-standard transmissions: Pipe-delimited docs.txt + STAT.txt + Images/
  • Custom pipe-delimited: Field mapping configured per project
  • JSON metadata sidecar: Full field capture, per instrument
  • Multi-page TIFF per instrument: Plant-image-bank ready
  • ZIP archive download: Single bundle for manual ingest

Don't see your plant's format? Tell us what you need — new adapters land continuously.

Two operating modes

Full automation or first-pass for your keyers.

Same pipeline, two ways to deploy. Pick the one that matches your operation today — you can move between them later.

Full automation

Go-forward daily posting from county source to plant. Documents that classify and extract with confidence above your threshold flow straight to delivery; only flagged exceptions surface for human review.

  • Per-county confidence thresholds.
  • Auto-deliver into the plant via API or scheduled drop.
  • End-to-end, from intake to plant, for when you don't have (or don't want) a separate keying step.
  • Same audit trail as a keyed transmission — overrides logged with author and timestamp.
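To make the threshold idea concrete, here is a minimal Python sketch of how a per-county confidence bar could split documents between delivery and review. Everything named here (`THRESHOLDS`, `Extraction`, `route`, the county keys, and the choice to judge a document by its lowest-confidence field) is an illustrative assumption, not the TitleTools implementation.

```python
from dataclasses import dataclass

# Hypothetical per-county thresholds; real values would be tuned per project.
THRESHOLDS = {"county_a": 0.95, "county_b": 0.90}
DEFAULT_THRESHOLD = 0.98  # assumed conservative fallback for unlisted counties


@dataclass
class Extraction:
    county: str
    doc_type: str
    field_confidences: dict  # field name -> confidence in [0, 1]


def route(doc: Extraction) -> str:
    """Return 'deliver' when the weakest field is above the county threshold."""
    threshold = THRESHOLDS.get(doc.county, DEFAULT_THRESHOLD)
    # Assumption: a document is only as trustworthy as its weakest field.
    lowest = min(doc.field_confidences.values(), default=0.0)
    return "deliver" if lowest > threshold else "review"


clean = Extraction("county_a", "deed", {"grantor": 0.99, "grantee": 0.97})
shaky = Extraction("county_a", "deed", {"grantor": 0.99, "grantee": 0.80})
print(route(clean))  # deliver
print(route(shaky))  # review
```

Taking the minimum across fields is the cautious choice: one uncertain party name is enough to send the whole instrument to a human.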

First-pass for keyers

Run the pipeline ahead of your existing keying team. Your keyers open instruments that already have doc-type, parties, dates, and recording info filled in — they verify and correct instead of typing from scratch.

  • Throughput per seat goes up without adding headcount.
  • Keyers focus on hard documents — handwritten, damaged, unusual instruments.
  • Routine instruments (deeds, releases, satisfactions) clear themselves.
  • Every keyer override is preserved alongside the AI suggestion for QA review.

How a batch moves

Five steps. Most of them you'll watch, not drive.

The pipeline runs in the background; you intervene where judgment matters.

1. Acquire from anywhere.

Drop a ZIP into the workspace, point us at an S3 bucket, or schedule a pull. Every batch shows up on one screen with the page count, document count, status, and progress visible at a glance.

  • ZIP, S3, SFTP, HTTP, watch folder, or direct upload.
  • Per-source authentication and rate limiting.
  • Resume on partial failure — no re-ingesting what you already have.
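One common way to get resume-without-re-ingesting behavior is to record a content hash for each file that has been stored and skip anything already recorded. The sketch below assumes that approach; the `ingest` function and the in-memory `already_ingested` set are hypothetical stand-ins, not a description of how TitleTools tracks state.

```python
import hashlib


def ingest(files, already_ingested):
    """Ingest only files whose content hash has not been recorded yet.

    files: dict mapping file name -> bytes.
    already_ingested: set of SHA-256 hex digests, updated in place.
    Returns the names ingested on this run.
    """
    ingested_now = []
    for name, data in files.items():
        digest = hashlib.sha256(data).hexdigest()
        if digest in already_ingested:
            continue  # stored on an earlier run, so skip it
        # A real pipeline would persist the page here, then record the digest.
        already_ingested.add(digest)
        ingested_now.append(name)
    return ingested_now


seen = set()
first = ingest({"0001.tif": b"page one", "0002.tif": b"page two"}, seen)
retry = ingest({"0001.tif": b"page one", "0002.tif": b"page two",
                "0003.tif": b"page three"}, seen)
print(first)  # ['0001.tif', '0002.tif']
print(retry)  # ['0003.tif']
```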
2. Staple: group pages into documents.

Recording packets arrive as single-page TIFFs. The Stapler workspace groups them into multi-page instruments: Auto-Stapler suggests boundaries from layout and content, and an operator confirms or adjusts from the keyboard. Page-level rotate, delete, and insert all work without leaving home row.

  • Single-page TIFFs in, multi-page instruments out.
  • Auto-Stapler suggestions confirmed (or overridden) by the operator.
  • Image cleanup runs as part of intake — deskew, denoise, auto-rotate.
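The grouping step itself is simple once the boundaries are known. As a rough illustration, the Python sketch below turns an ordered page list plus one "starts a new document" flag per page into multi-page documents. The `staple` function and the flag lists are assumptions made for the example; how Auto-Stapler actually derives its suggestions is not shown here.

```python
def staple(pages, starts_document):
    """Group an ordered list of page names into documents.

    starts_document[i] is True when page i begins a new document.
    """
    documents = []
    for page, is_start in zip(pages, starts_document):
        if is_start or not documents:
            documents.append([page])      # open a new document
        else:
            documents[-1].append(page)    # continue the current document
    return documents


pages = ["0001.tif", "0002.tif", "0003.tif", "0004.tif"]
suggested = [True, False, True, False]   # hypothetical Auto-Stapler suggestion
confirmed = [True, False, False, True]   # operator moved one boundary

print(staple(pages, suggested))
# [['0001.tif', '0002.tif'], ['0003.tif', '0004.tif']]
print(staple(pages, confirmed))
# [['0001.tif', '0002.tif', '0003.tif'], ['0004.tif']]
```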
3. Index: capture the metadata fields.

For each stapled document, capture the metadata fields the target plant needs — doc type, instrument number, recording date, book/page, grantor, grantee, parcel. AI auto-fills with confidence scores; the indexer accepts or overrides. Exceptions surface here for human attention.

  • Custom field schema per project / county / plant.
  • AI suggestions inline with confidence scores — keyboard-driven accept or override.
  • Field carry-forward across documents (recording date, county, etc.).
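Field carry-forward can be pictured as "reuse the last value until someone types a new one". The Python sketch below illustrates that rule under stated assumptions: the `CARRY_FORWARD` field names and the `index_batch` helper are hypothetical, and a blank or missing value is treated as "not entered".

```python
CARRY_FORWARD = ("recording_date", "county")  # assumed carry-forward fields


def index_batch(documents):
    """Fill blank carry-forward fields from the most recent entered value."""
    carried = {}
    indexed = []
    for doc in documents:
        filled = dict(doc)  # copy so the input records stay untouched
        for name in CARRY_FORWARD:
            if filled.get(name):
                carried[name] = filled[name]   # a new value was entered
            elif name in carried:
                filled[name] = carried[name]   # reuse the previous value
        indexed.append(filled)
    return indexed


batch = [
    {"doc_type": "deed", "recording_date": "2024-01-15", "county": "Example"},
    {"doc_type": "release"},
    {"doc_type": "deed", "recording_date": "2024-01-16"},
]
for row in index_batch(batch):
    print(row["doc_type"], row["recording_date"], row["county"])
# deed 2024-01-15 Example
# release 2024-01-15 Example
# deed 2024-01-16 Example
```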
4. Examine: verify against the source.

Open any document and see every extracted field next to the source page region that produced it — bounding boxes drawn over the original scan. Confidence under threshold? Click to override; both the AI suggestion and the correction are preserved in the audit trail.

  • Split-screen viewer with field-to-page citation linking.
  • Override any field; both the AI suggestion and the correction are preserved.
  • Every override carries author and timestamp into the audit trail.
5. Export in your plant's format.

When the batch is ready, emit a transmission archive — direct to your plant's API, into a watched drop folder, or as a download. Same audit trail every keyed transmission already carries.

  • PropertySync, industry-standard pipe-delimited, custom CSV, JSON, or ZIP.
  • Plant-image-bank-ready multi-page TIFFs per instrument.
  • Transmission statistics file (STAT.txt) generated automatically.
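As a rough picture of the pipe-delimited export, the Python sketch below builds the text of a docs.txt-style file and a STAT.txt-style summary in memory. The column order in `FIELDS`, the `page_count` key, and the two statistics lines are assumptions chosen for illustration; real plant formats define their own layouts, and field mapping is configured per project.

```python
import csv
import io

# Hypothetical column order; an actual plant format dictates its own.
FIELDS = ["instrument_number", "doc_type", "recording_date", "grantor", "grantee"]


def build_transmission(instruments):
    """Return (docs_txt, stat_txt) strings for a list of instrument dicts."""
    docs = io.StringIO()
    writer = csv.writer(docs, delimiter="|", lineterminator="\n",
                        quoting=csv.QUOTE_MINIMAL)
    for inst in instruments:
        writer.writerow([inst.get(name, "") for name in FIELDS])
    pages = sum(inst.get("page_count", 0) for inst in instruments)
    # Assumed statistics content: document and page totals only.
    stat = f"documents|{len(instruments)}\npages|{pages}\n"
    return docs.getvalue(), stat


docs_txt, stat_txt = build_transmission([{
    "instrument_number": "2024-000123", "doc_type": "DEED",
    "recording_date": "2024-01-15", "grantor": "SMITH JOHN",
    "grantee": "DOE JANE", "page_count": 3,
}])
print(docs_txt, end="")  # 2024-000123|DEED|2024-01-15|SMITH JOHN|DOE JANE
print(stat_txt, end="")
```

Using the `csv` module rather than `"|".join(...)` means a value that happens to contain a pipe is quoted instead of silently shifting every later column.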

Stapler, up close

A keyboard-driven stapling workspace.

Recording packets arrive as single-page TIFFs. The Stapler groups them into multi-page documents: Auto-Stapler suggests where one instrument ends and the next begins, and the operator confirms with one keystroke. Exceptions and verification live in Index and Examine, not here.

Hands stay on home row.

A filmstrip of pages on the left, a single-page preview in the middle, document metadata on the right. Arrow keys walk pages, ⏎ closes off the current document, Tab moves through fields, and field values carry forward to the next document so you don't re-type the recording date 50 times in a row.

  • Auto-Stapler suggests document boundaries; you confirm or override with one keystroke.
  • Page-level rotate, delete, insert blank — without leaving the keyboard.
  • Every Auto-Stapler suggestion plus every override is recorded with author and timestamp.

Keyboard shortcuts

Close document: ⏎
Next / prev page: ↑ ↓
Next index field: Tab
Accept suggestion
Rotate: R
Insert blank: B
[Screenshot: Stapler workspace, with the page filmstrip on the left, recording-document preview in the middle, and document fields and keyboard shortcuts on the right]

Why this matters

Throughput without adding heads.

Numbers are indicative ranges, not promises. We'll model your specific cost per instrument against a sample batch of your data on the demo call.

Pages per dollar

↑ 10–40×

Highest leverage on routine instrument types — deeds, releases, satisfactions — where the pipeline runs at full automation.

Per-seat throughput

↑ 3–6×

When deployed as a first-pass extractor in front of an existing keying team. Keyers verify instead of typing from scratch.

Time to plant

↓ hours → minutes

Same-day go-forward posting instead of next-day batch turnaround.

Trust

No black-box deliveries.

If your plant or your customer raises a question about a posted document, you can answer it.

Same audit trail as a keyed transmission.

Every classification, every extraction, every override — author, timestamp, AI confidence, original suggestion. Inspectable months later, exportable on request.
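One way to picture such an audit entry is as an immutable record that keeps the AI suggestion next to whatever value was finally posted. The Python sketch below is an assumption about shape only: `FieldAudit`, `apply_override`, and the sample names are hypothetical, not the TitleTools schema.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class FieldAudit:
    field: str
    ai_suggestion: str
    ai_confidence: float
    final_value: str
    author: Optional[str] = None         # None: AI value accepted unchanged
    overridden_at: Optional[str] = None  # ISO 8601 timestamp of the override


def apply_override(entry, new_value, author, when=None):
    """Return a new record with the correction, keeping the AI suggestion."""
    when = when or datetime.now(timezone.utc)
    return replace(entry, final_value=new_value, author=author,
                   overridden_at=when.isoformat())


entry = FieldAudit("grantee", "DOE JANE", 0.71, "DOE JANE")
fixed = apply_override(entry, "DOE JANET", "keyer_17")
print(fixed.ai_suggestion, "->", fixed.final_value, "by", fixed.author)
# DOE JANE -> DOE JANET by keyer_17
```

Because the record is frozen, an override produces a new entry rather than editing the old one, so the original suggestion cannot be lost by accident.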

Reproducible transmissions.

A delivered transmission can be regenerated bit-for-bit from the source batch and the pipeline version. No mystery deltas if a plant raises a question.
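A bit-for-bit claim like this is typically checked by comparing cryptographic digests of the delivered and regenerated archives. The Python sketch below shows one such check, assuming a transmission can be represented as a mapping of file names to bytes; `transmission_digest` is a hypothetical helper, not part of the product.

```python
import hashlib


def transmission_digest(files):
    """SHA-256 over file names and contents, visited in sorted name order.

    files: dict mapping file name -> bytes.
    """
    h = hashlib.sha256()
    for name in sorted(files):
        data = files[name]
        encoded = name.encode("utf-8")
        # Length prefixes keep name/content boundaries unambiguous.
        h.update(len(encoded).to_bytes(8, "big"))
        h.update(encoded)
        h.update(len(data).to_bytes(8, "big"))
        h.update(data)
    return h.hexdigest()


delivered = {"docs.txt": b"1|DEED\n", "STAT.txt": b"documents|1\n"}
regenerated = {"STAT.txt": b"documents|1\n", "docs.txt": b"1|DEED\n"}
tampered = {"docs.txt": b"1|DEEDS\n", "STAT.txt": b"documents|1\n"}

print(transmission_digest(delivered) == transmission_digest(regenerated))  # True
print(transmission_digest(delivered) == transmission_digest(tampered))     # False
```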

You set the confidence bar.

Per-county, per-doc-type thresholds for what auto-clears versus what surfaces for review. Tune over time as you build trust in specific instrument types.

See it on your data

Book a 30-minute pipeline demo.

Send us a sample county recording packet — even one day's worth — and we'll run it through the pipeline live. You'll see stapling, indexing, examination, and the plant-ready transmission your plant would receive.

A real person replies. No marketing automation; no email drip. Your details are not used to train any AI model.

Or email hello@limelyte.com directly.