AI-powered financial auditing: from 10 months to hours
10 months → hours
audit execution time
40%+
of federalized funds reviewed with the platform
Minimal
human error in massive analysis
Thechallenge
A supreme audit institution in Mexico — with thousands of auditors and responsibility for reviewing federal public spending — received documentation from audited entities in boxes and binders: account statements, invoices, accounting vouchers, and records reviewed by hand.
Digitize and automate the massive analysis of financial documentation; absorb infrastructure demand peaks during statutory audit periods; free auditors from repetitive tasks so they can focus on analysis.
Thesolution
We developed and implemented an AI document processing platform (our ScriptoMind solution) built 100% on Google Cloud: Cloud Storage and Cloud SQL for data, Vertex AI and the Gemini API for analysis, Document AI and Vision AI for automated reading of unstructured documents. The process began with a proof of concept in 2023 and went live in the area responsible for reviewing more than 40% of federal funds transferred to states and municipalities. Continuous support from Nuuptech and Google Cloud throughout the two-year implementation.
Vertex AI
Gemini
Document AI
Vision AI
Cloud SQL
Cloud Storage
Results
10 months → hours
audit execution time
40%+
of federalized funds reviewed with the platform
Minimal
human error in massive analysis
Elastic scalability
during audit peaks with no physical infrastructure investment
nuup://replay — decision mode
decision 1/4
decision 1 of 4: How do you introduce AI into an institution built on paper?
Thousands of auditors, statutory processes and zero room for a visible failure: the first decision defined the credibility of the whole project.
your call
A big-bang rollout would have bet the credibility of the project on a system not yet validated with real documents. Any stumble in production would have burned internal trust before proving value.
✓ real decision — you called it
what was done
The 2023 proof of concept validated the technology on real documentation with no operational risk. With evidence in hand, the platform went live where it mattered most: the area that reviews more than 40% of federalized funds.
your call
Digitizing without a clear use case produces terabytes of images, not faster audits. The value was in automated analysis, not in mass scanning.
Lesson: $ git commit -m "feat: pilot small, deploy where it matters most"
decision 2/4
decision 2 of 4: Rules engine, end-to-end LLM, or a hybrid?
An audit finding admits no hallucinations; but a rules engine alone cannot read a scanned accounting voucher.
your call
Traditional OCR breaks on stamps, heterogeneous formats and unstructured documents. Rules are auditable but blind: without robust document reading there is nothing to evaluate.
your call
Delegating the full verdict to a generative model introduces an unacceptable risk in government auditing: conclusions that cannot be traced. An audit finding must be defensible rule by rule.
✓ real decision — you called it
what was done
AI does what rules cannot — read unstructured documents at scale — and the domain rules do what AI should not: issue the verdict. Every conclusion stays traceable to the rule that produced it.
Lesson: $ git commit -m "fix: AI reads, business rules decide"
decision 3/4
decision 3 of 4: Where does the platform run: own hardware or the cloud?
Sensitive financial information on one side; extreme demand peaks during statutory audit periods on the other.
your call
Sizing your own infrastructure for the peak means paying for it all year. The capacity the audit period demands would sit idle the rest of the time — or worse: fall short exactly when the law sets the deadlines.
✓ real decision — you called it
what was done
Cloud elasticity absorbs the statutory peaks with no investment in physical infrastructure and releases capacity when idle. Security was solved with architecture and controls, not by avoiding the cloud.
your call
Splitting the platform in two doubles the operation and turns the network into the bottleneck of massive analysis. The complexity of keeping both worlds in sync outweighs the perceived benefit.
Lesson: $ git commit -m "perf: elasticity is rented, not bought"
decision 4/4
decision 4 of 4: Which metric declares the project a success?
A system can process millions of pages and still change nothing in the real audit operation.
your call
Volume is a vanity metric: processing millions of pages does not guarantee better audits. A machine indicator says nothing about the work of the auditor.
your call
Framing AI as a staff cut would have buried adoption: nobody eagerly feeds the system that is coming for their job. The auditors were the key users, not the cost to eliminate.
✓ real decision — you called it
what was done
The metric that mattered was the full cycle: audits that took 10 months now run in hours once the information is fed in. Auditors left the repetitive work to focus where they add judgment: the analysis.
Lesson: $ git commit -m "docs: measure the full cycle, not the throughput"
replay complete
You matched 0 of 4 decisions.
This is what a project looks like from the inside: architecture decisions with consequences. If you are facing something similar, let us talk.