Annatech_

Applied AI / Document Intelligence

AI validation of US tax-exemption certificates across seven state form families

Global healthcare-technology enterprise - finance operations

US state form families covered: 7+
Deterministic validation rules: 65+
Generic checks, including fuzzy vendor matching: 8
Disposition (pass / fail / review): 3-way

Architecture at a glance

[Diagram: deterministic after OCR, every decision explainable. PDF scans (7+ state form families) → Azure Document Intelligence (OCR + field extraction) → 8 generic checks (fuzzy vendor match, dates) → ~65 state rules (TX, AZ, GA, OH, IL, MA, MI, ...; checkbox semantics, dual-anchor regex) → PASS / FAIL / REVIEW. A per-rule evidence record is kept on every certificate: which check fired, what it saw, why it decided.]

Context

US tax-exemption certificates arrive as scans of state-specific forms - Texas, Arizona, Georgia, Ohio, Illinois, Massachusetts, Michigan and more, several with multiple form revisions. Finance staff validated vendor identity, exemption numbers, signatures, dates and state-specific fields by hand.

Constraint

A certificate wrongly accepted as valid has direct tax consequences, so the pipeline had to be conservative and - critically - explainable. A black-box "the model said yes" is not an acceptable answer to a tax auditor. Scan quality varies wildly, and each state form has its own required fields and edge cases.

Architecture

Azure Document Intelligence performs OCR and field extraction. Everything downstream is deterministic: eight generic checks run on every document (including fuzzy vendor-name matching to absorb OCR noise, jurisdiction validity, expiration and effective-date anomaly checks), then a state-specific rule set - around 65 rules across the form families - validates required fields, checkbox semantics and free-text descriptions using dual-anchor regex extraction.
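Two of the techniques named above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the production code: the function names, the normalisation steps and the use of the standard library's difflib are all hypothetical choices, and "dual-anchor" is read here as capturing the text that sits between a known leading label and a known trailing label on the form.

```python
import re
from difflib import SequenceMatcher
from typing import Optional


def normalize(name: str) -> str:
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]", " ", name.lower())
    return " ".join(cleaned.split())


def vendor_similarity(ocr_name: str, master_name: str) -> float:
    """Similarity in [0, 1] between an OCR'd name and a vendor-master name.

    Assumption: a character-level ratio is enough to absorb typical OCR
    confusions such as 'l' read as '1'. The acceptance threshold would be
    tuned on real data and is deliberately not fixed here.
    """
    return SequenceMatcher(None, normalize(ocr_name), normalize(master_name)).ratio()


def extract_between(text: str, start_anchor: str, end_anchor: str) -> Optional[str]:
    """Dual-anchor extraction: return the text between two known labels.

    Returns None when either anchor is missing or nothing sits between
    them, so the caller can route the certificate to review rather than
    guess at a value.
    """
    pattern = re.escape(start_anchor) + r"\s*(.*?)\s*" + re.escape(end_anchor)
    match = re.search(pattern, text, flags=re.DOTALL | re.IGNORECASE)
    if match is None:
        return None
    value = " ".join(match.group(1).split())  # flatten OCR line breaks
    return value or None
```

Requiring both anchors is the conservative choice: a single leading label would happily capture whatever follows it on a badly scanned page, whereas a missing trailing label here yields no value at all.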

Every certificate leaves the pipeline with a three-way disposition and a per-rule evidence record: which check fired, what it saw, why it decided as it did.
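A minimal sketch of how per-rule evidence could roll up into the three-way disposition. The class names, field names and the precedence order (any fail wins, then any review, and pass only when every rule passed) are assumptions chosen to match the conservative stance described under Constraint; the source does not spell out the actual aggregation rule.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Outcome(Enum):
    PASS = "pass"
    FAIL = "fail"
    REVIEW = "review"


@dataclass(frozen=True)
class Evidence:
    rule_id: str      # which check fired
    observed: str     # what it saw
    outcome: Outcome  # what it decided
    reason: str       # why it decided as it did


def disposition(evidence: List[Evidence]) -> Outcome:
    """Aggregate per-rule outcomes conservatively (assumed precedence).

    An empty evidence list means nothing was checked, so the certificate
    goes to a human instead of passing by default.
    """
    if not evidence:
        return Outcome.REVIEW
    outcomes = {e.outcome for e in evidence}
    if Outcome.FAIL in outcomes:
        return Outcome.FAIL
    if Outcome.REVIEW in outcomes:
        return Outcome.REVIEW
    return Outcome.PASS
```

Because each Evidence record is immutable and carries its own reason, the list itself can be stored alongside the certificate and serve as the audit trail.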

Outcome

Machine validation with explainable, auditable results. Humans handle only the review queue, and every automated decision can be defended line by line.
