Six Ways Federal Agencies Keep Getting AI Procurement Wrong
The GAO’s April 2026 report on federal AI acquisitions (GAO-26-107859) is valuable not just for its top-line findings but for the taxonomy it provides of where government AI procurement consistently breaks down. Based on interviews with officials at DOD, DHS, GSA, VA, and the Department of Commerce, the report identifies six challenge areas—three strategic and three programmatic—that recurred across agencies regardless of the specific AI capability being acquired.
Access to Subject Matter Experts
Every agency in the study flagged this one. AI procurement requires data scientists, software engineers, cybersecurity specialists, and privacy experts at multiple stages: requirements definition, vendor evaluation, and post-award performance monitoring. Without them, agencies write weaker contracts and then struggle to hold vendors accountable. Officials at VA’s Automated Decision Support effort said they struggled to build effective evaluation criteria for source selection because the technical expertise wasn’t available at the award phase.
Data Ownership and Intellectual Property Rights
Several officials framed this as the most consequential long-term risk. AI outcomes depend on data quality and ownership; agencies that don’t nail down data rights at contract award often find themselves locked out of their own outputs. FEMA’s Geospatial Damage Assessment program discovered post-award that it couldn’t share model outputs with state and federal partners because it hadn’t secured the necessary rights upfront. Officials at DOD’s Maven program convened a two-day data rights summit with legal, program, and contracting staff just to sort through appropriate IP contract language—and still felt they needed more training afterward.
Traditional Acquisition Timelines
FAR-based contracts can take up to two years to award. AI development cycles don’t wait. VA’s senior leaders told GAO that by the time many agencies move from pilot to production at the standard pace, they are no longer purchasing innovation. The Army’s Project Linchpin officials were even more pointed: the traditional model of developing a system over five years, followed by operations and maintenance, is structurally incompatible with how AI actually evolves. FEMA’s Planning Assistance for Resilient Communities program skipped a new contract entirely and leveraged an existing one, accepting weaker AI-specific terms and conditions, just to meet its deadlines.
Requirements Definition and Contract Terms
Without well-defined requirements embedded in contracts, agencies have no leverage over vendor performance. Maven’s early contracts lacked AI-specific performance metrics, making it impossible to hold vendors accountable for weekly deliverables needed under Agile development. VA officials reported their standard security contract language was outdated and insufficient for protecting veterans’ information. Several agencies expressed a desire for government-wide standard AI contract language—something OMB M-25-22 gestures toward but leaves to agencies to implement.
Early Testing and Continuous Evaluation
AI systems degrade. Models trained on one data distribution perform differently when conditions shift, a phenomenon called model drift. Evaluating AI both before contract award and throughout the performance period is essential, but the methods for doing so are still being developed. NIST officials told GAO that universal AI test standards don’t exist yet, given the diversity and complexity of AI services. GSA’s USAi platform addressed this by running a battery of performance tests against leading models over time, including tests of whether newer, more expensive model versions actually justified their higher prices.
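To make the idea of continuous evaluation concrete, here is a minimal sketch of a drift check: score the same model on a labeled sample taken around contract award and on a later sample, then flag the model if accuracy has fallen by more than an agreed tolerance. The function names, the 5-point tolerance, and the toy data are all assumptions made for illustration. None of them come from the GAO report or from any agency's actual test suite.

```python
from typing import Sequence


def accuracy(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of predictions that match the ground-truth labels."""
    if len(predictions) != len(labels) or len(labels) == 0:
        raise ValueError("predictions and labels must be non-empty and equal length")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def has_drifted(baseline_acc: float, current_acc: float, tolerance: float = 0.05) -> bool:
    """True when accuracy has dropped by more than `tolerance` since the baseline.

    The 0.05 default is a hypothetical threshold; a real contract would set
    its own metric and acceptable range.
    """
    return (baseline_acc - current_acc) > tolerance


# Toy, made-up evaluation samples (1 = positive class, 0 = negative class).
award_preds = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
award_labels = [1, 0, 1, 1, 0, 1, 0, 1, 1, 1]
later_preds = [1, 1, 0, 1, 0, 0, 0, 0, 1, 1]
later_labels = [1, 0, 1, 1, 0, 1, 0, 1, 1, 1]

baseline = accuracy(award_preds, award_labels)
current = accuracy(later_preds, later_labels)
print(f"baseline={baseline:.2f} current={current:.2f} drifted={has_drifted(baseline, current)}")
```

In practice the metric, the sampling schedule, and the threshold would be written into the contract so that a drift flag carries consequences for the vendor.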
AI Pricing and Overall Cost
Vendors price AI capabilities in ways that are difficult to benchmark, evaluate, or predict. The Army’s XM-30 targeting solution received proposals with AI licensing fees around $300,000 per vehicle per year—which would have exceeded $500 million annually just in licensing for the fleet. The Army rejected that pricing and went looking for more sustainable models. Project Linchpin officials warned that agencies systematically underestimate total cost of ownership by focusing on model training costs while overlooking the cloud infrastructure and compute costs required to sustain AI capabilities over time.
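The licensing figures above can be sanity-checked with simple arithmetic. The passage does not state the fleet size, so this sketch makes no assumption about it. It only derives the smallest fleet at which a $300,000 per-vehicle annual fee pushes total licensing past $500 million. The function names are illustrative.

```python
FEE_PER_VEHICLE_PER_YEAR = 300_000   # USD, approximate fee cited in the proposals
ANNUAL_THRESHOLD = 500_000_000       # USD, annual licensing total cited above


def annual_licensing_cost(vehicles: int, fee: int = FEE_PER_VEHICLE_PER_YEAR) -> int:
    """Total yearly licensing cost for a fleet of `vehicles`."""
    return vehicles * fee


def min_fleet_exceeding(threshold: int = ANNUAL_THRESHOLD,
                        fee: int = FEE_PER_VEHICLE_PER_YEAR) -> int:
    """Smallest whole number of vehicles whose total fee is strictly above `threshold`."""
    return threshold // fee + 1


breakeven_fleet = min_fleet_exceeding()
print(breakeven_fleet)                         # 1667
print(annual_licensing_cost(breakeven_fleet))  # 500100000
```

In other words, the cited total implies a fleet of roughly 1,700 vehicles or more at that per-unit fee, before counting any infrastructure or compute costs.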
Taken together, these six challenges form a coherent picture: federal agencies are buying a technology they don’t fully understand, under contract frameworks designed for a different era, without the institutional knowledge needed to improve. The OMB guidance issued in April 2025 corresponds to each challenge area and provides a framework—but, as GAO notes, agency-level policy implementation is what actually determines outcomes.