Definition

What is OCR?

Optical Character Recognition (OCR) is a technology that converts the text in images, PDFs and scanned documents into machine-readable text and, in modern systems, structured data.

How OCR works

At a high level, OCR pipelines follow three steps:

1. Image preprocessing
The document is deskewed, denoised and binarised. Contrast is normalised so the next step can reliably distinguish characters from the background.
2. Character detection
The engine segments the page into regions, lines and individual character candidates. Modern systems use neural networks to locate text in complex layouts, including rotated and handwritten content.
3. Text extraction
Detected character shapes are classified into Unicode characters, then grouped back into words, lines and paragraphs. The output is plain text and, in modern stacks, structured fields (dates, totals, line items).
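The three steps above can be sketched end to end on a toy example. This is a minimal illustration and not how a production engine works: the 3x3 glyph templates, the fixed threshold of 128, the `PAGE` image and all function names are assumptions made up for this sketch. Real systems use adaptive binarisation and neural networks instead of pixel-by-pixel template matching.

```python
# Toy OCR pipeline: preprocessing -> character detection -> text extraction.
# Templates, threshold and image are illustrative assumptions only.

# Hypothetical 3x3 glyph templates (1 = ink, 0 = background).
TEMPLATES = {
    "L": ((1, 0, 0), (1, 0, 0), (1, 1, 1)),
    "O": ((1, 1, 1), (1, 0, 1), (1, 1, 1)),
    "T": ((1, 1, 1), (0, 1, 0), (0, 1, 0)),
}

def binarise(gray, threshold=128):
    """Step 1: map grayscale pixels (0-255) to ink (1) or background (0)."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]

def segment_columns(binary):
    """Step 2: cut one text line into glyphs at columns that contain no ink."""
    width = len(binary[0])
    has_ink = [any(row[x] for row in binary) for x in range(width)]
    spans, start = [], None
    for x, ink in enumerate(has_ink + [False]):  # sentinel closes the last span
        if ink and start is None:
            start = x
        elif not ink and start is not None:
            spans.append((start, x))
            start = None
    return [tuple(tuple(row[a:b]) for row in binary) for a, b in spans]

def classify(glyph):
    """Step 3: pick the template with the fewest mismatching pixels."""
    def mismatches(template):
        if len(template) != len(glyph) or len(template[0]) != len(glyph[0]):
            return float("inf")  # different shape: cannot be compared
        return sum(t != g
                   for t_row, g_row in zip(template, glyph)
                   for t, g in zip(t_row, g_row))
    return min(TEMPLATES, key=lambda ch: mismatches(TEMPLATES[ch]))

def recognise(gray):
    """Run all three steps and join the recognised characters."""
    return "".join(classify(g) for g in segment_columns(binarise(gray)))

D, W = 30, 230  # dark ink and light paper intensities
PAGE = [
    [D, W, W, W, D, D, D, W, D, D, D],
    [D, W, W, W, D, W, D, W, W, D, W],
    [D, D, D, W, D, D, D, W, W, D, W],
]

print(recognise(PAGE))  # LOT
```

Each function maps to one step, so any stage can be swapped for a stronger technique without touching the others.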

Traditional OCR vs AI/LLM OCR

Traditional OCR engines (Tesseract, ABBYY) focus on character recognition and hand the output to downstream rule-based parsers. AI/LLM OCR combines recognition with language understanding: the same model reads the document and returns structured fields directly, without per-vendor templates.

Aspect	Traditional OCR	AI / LLM OCR
Setup	Templates per vendor	Zero-shot, no templates
Output	Raw text + bounding boxes	Structured fields (JSON)
Handles new layouts	Poorly without retraining	Natively
Reasoning	None	Contextual (infers missing fields)
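To make the "Setup" and "Handles new layouts" rows concrete, here is a rough sketch of the rule-based parsing that typically follows a traditional engine. The sample texts, the two regular expressions and the `parse_with_rules` function are hypothetical, written only for this illustration. The returned dictionary has the kind of shape an AI/LLM OCR system would aim to produce directly.

```python
import json
import re
from datetime import datetime

# Hypothetical raw text, as a traditional engine might return it for one invoice.
RAW_TEXT = """ACME Supplies Ltd
Invoice date: 14/03/2024
Subtotal 100.00
VAT 20.00
Total due: 120.00"""

# Same information from a different (made-up) vendor with different wording.
OTHER_LAYOUT = """Globex GmbH
Date 2024-03-14
Amount payable 120.00"""

# Hand-written rules tied to the first layout's exact labels.
DATE_RULE = re.compile(r"Invoice date:\s*(\d{2}/\d{2}/\d{4})")
TOTAL_RULE = re.compile(r"Total due:\s*([\d.]+)")

def parse_with_rules(text):
    """Turn raw OCR text into structured fields using layout-specific rules."""
    date_match = DATE_RULE.search(text)
    total_match = TOTAL_RULE.search(text)
    invoice_date = None
    if date_match:
        parsed = datetime.strptime(date_match.group(1), "%d/%m/%Y")
        invoice_date = parsed.date().isoformat()
    return {
        "vendor": text.splitlines()[0].strip(),  # assumes vendor is on line 1
        "invoice_date": invoice_date,
        "total": float(total_match.group(1)) if total_match else None,
    }

print(json.dumps(parse_with_rules(RAW_TEXT)))
# {"vendor": "ACME Supplies Ltd", "invoice_date": "2024-03-14", "total": 120.0}

print(json.dumps(parse_with_rules(OTHER_LAYOUT)))
# {"vendor": "Globex GmbH", "invoice_date": null, "total": null}
```

The second call shows why rule-based setups need a template per vendor: the data is on the page, but the rules no longer match its labels.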

Common use cases

Invoices and receipts: Extract vendor, dates, totals, tax and line items for automated bookkeeping.
Identity documents: Read passports, driver’s licences and national IDs for KYC onboarding.
Forms and contracts: Digitise signed forms, purchase orders and contracts for workflow automation.
Archive digitisation: Convert historical paper archives into searchable, full-text databases.

Accuracy factors

OCR accuracy depends on several factors, including:

Image quality: resolution, lighting, skew and compression.
Font and language: standard fonts and Latin scripts outperform handwritten or non-Latin scripts.
Layout complexity: multi-column pages, tables and stamps challenge traditional engines.
Model type: modern AI OCR reasons about context, so missing or partially-occluded fields can still be recovered.
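One common way to put a number on the effect of these factors is the character error rate (CER): the edit distance between the OCR output and a hand-checked reference, divided by the reference length. The text above does not name a metric, so treat this as an assumed, illustrative choice. The function names and sample strings are made up for this sketch.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings, computed one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = curr
    return prev[-1]

def character_error_rate(reference, hypothesis):
    """Edits needed to turn the OCR output into the reference, per reference character."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)

reference = "Total due: 120.00"   # hand-checked ground truth
hypothesis = "Tota1 due: I20.00"  # made-up OCR output with two misread characters

print(edit_distance(reference, hypothesis))  # 2
print(round(character_error_rate(reference, hypothesis), 3))  # 0.118
```

Running the same measurement on clean and degraded scans of the same documents gives a simple way to compare how much each factor costs.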

Related terms

Invoice data entry: the downstream process that consumes OCR output.
Accounts payable automation: the broader workflow OCR sits within.
3-way matching: a downstream control that relies on accurate OCR data.

See AI OCR in action

Zerentry uses AI/LLM-powered OCR to extract structured fields from any invoice or receipt, with no templates required.


Further reading

AI OCR software

Invoice data entry

Zerentry vs Mindee
