The Federal Government's AI Amnesia Problem
There is a specific and fixable failure running through federal AI procurement that GAO’s April 2026 report (GAO-26-107859) surfaces with unusual clarity: agencies are accumulating experience with AI acquisitions and then letting that experience evaporate.
The pattern shows up in concrete cases. VA’s SoKAT program—a natural language processing tool built to scan veterans’ survey responses for indicators of suicidal ideation—was retired in January 2023 after officials concluded it didn’t improve enough over existing solutions to justify the cost. No lessons were documented. VA has multiple other AI programs targeting suicide prevention among veterans. Those programs could have benefited from what SoKAT’s team learned. They didn’t, because it was never written down.
FEMA’s Geospatial Damage Assessment program—which uses machine learning and computer vision to analyze aerial imagery of disaster zones—ran into accuracy problems the team worked through informally. Officials observed that the model struggled to distinguish between different types of dwellings and anticipated performance degradation when applied to imagery from regions different from its training set. They communicated this to the vendor verbally. They didn’t document it. FEMA then renewed the same vendor’s contract for another twelve months. Future FEMA personnel using that contract across the country to support disaster recovery will approach it without the benefit of those early insights.
These aren’t isolated failures of individual programs. GAO found that none of the four agencies in its scope—DOD, DHS, GSA, and VA—had departmental policies requiring systematic collection of lessons learned from AI acquisitions. Officials acknowledged this directly: they didn’t document lessons because nothing required them to. The institutional memory problem is a policy gap, not an attitude problem.
The gap has consequences beyond individual programs. OMB’s M-25-22, issued in April 2025, specifically directed agencies to share knowledge about AI acquisitions, including standard contract clause language, testing approaches, and other best practices, through a web-based repository that GSA is developing. The mechanism is being built. The inputs are missing, because the upstream collection requirement was never written into agency policy.
The same report shows what good practice looks like. DOD’s Maven program, which uses machine learning and computer vision to analyze geospatial imagery for targeting support, went through its own early failures with poorly specified contracts and inadequate requirements. But officials at the National Geospatial-Intelligence Agency (NGA) learned from those failures, documented them, and progressively built more specific requirements into follow-on contracts. They convened a two-day data rights summit. They developed cross-functional teams. They used small initial contracts to test vendors before committing to larger acquisitions. The institutional knowledge compounds.
GSA’s USAi platform—a generative AI service providing chatbot capabilities across the federal government—shows similar evidence of accumulated learning: reused contract terms from previous AI acquisitions, structured pricing models designed to avoid overpayment for simple use cases, and a battery of reliability tests run against competing models over time to evaluate whether price increases in newer versions are justified by performance gains.
What separates Maven and USAi from the programs that lost their lessons isn’t sophistication or resources. It’s whether institutional memory was treated as a deliverable.
GAO’s four recommendations are narrowly scoped: each agency should update its policies to require systematic collection of AI acquisition lessons learned and submission to the GSA repository. All four agencies concurred. DHS committed to updating its AI procurement process guide by July 31, 2026. VA set August 1, 2026, as its target. DOD and GSA concurred without specifying dates in their public responses.
The broader question is whether the GSA repository itself will become a functional knowledge base or another federal system that agencies nominally submit to without meaningfully using. That outcome depends on whether agencies treat knowledge sharing as an institutional obligation rather than a compliance checkbox—a distinction that policy language alone cannot guarantee, but that policy language is necessary to establish.