CAISI Signs Frontier AI Testing Agreements With Google DeepMind, Microsoft, and xAI: What You Need to Know


The Center for AI Standards and Innovation (CAISI) is a federal body housed within the Department of Commerce's National Institute of Standards and Technology (NIST). It serves as the U.S. government's primary point of contact for the AI industry, facilitating voluntary testing, collaborative research, and the development of best practices for commercial AI systems. On May 5, 2026, CAISI announced new agreements with three major AI developers (Google DeepMind, Microsoft, and xAI) to conduct pre-deployment evaluations and targeted research on their frontier AI models.

In this article, we’ll discuss what these new agreements mean for the future of AI oversight in the United States, how they build on earlier partnerships with companies like Anthropic and OpenAI, and why the involvement of an interagency taskforce adds a new layer of depth to the government’s approach to AI security. We’ll also look at CAISI’s origins, its evolving mission, and what this development signals for the broader AI industry.


TL;DR Snapshot

Under the new agreements, Google DeepMind, Microsoft, and xAI will provide their frontier AI models to CAISI for government evaluation before public release. The arrangement also covers post-deployment assessments and ongoing research. According to NIST’s official announcement, CAISI has already completed more than 40 such evaluations, including on state-of-the-art models that have never been released to the public.

Key takeaways:

  • Expanded government access: AI developers will frequently hand over versions of their models with reduced or removed safeguards so that CAISI can probe for national security risks, including threats related to cybersecurity, biosecurity, and military misuse.
  • Interagency collaboration: Evaluators from multiple government agencies participate through the TRAINS Taskforce, a group of interagency experts from the Departments of Defense, Energy, Homeland Security, and more, all focused on AI national security concerns.
  • Building on earlier partnerships: These new agreements join existing, renegotiated partnerships with Anthropic and OpenAI that were first established in 2024, making CAISI’s testing framework the most comprehensive federal effort of its kind.

Who should read this: Policymakers, AI researchers, cybersecurity professionals, tech industry leaders, and anyone following the intersection of artificial intelligence and national security.


From the AI Safety Institute to CAISI: A Brief History

To understand today’s announcement, it helps to know where CAISI came from. The organization was originally founded as the U.S. AI Safety Institute (AISI) in November 2023, following President Biden’s Executive Order 14110 on safe, secure, and trustworthy AI. Its initial mandate was to develop science-based benchmarks for risk assessment, promote best practices, and guide responsible AI innovation in partnership with more than 200 academic institutions, industry leaders, and global allies.

In June 2025, Commerce Secretary Howard Lutnick renamed and restructured the institute as the Center for AI Standards and Innovation. As Broadband Breakfast reported, Lutnick stated that CAISI would focus on demonstrable risks, such as cybersecurity, biosecurity, and chemical weapons, while also investigating malign foreign influence arising from the use of adversaries’ AI systems. The rebrand signaled a shift in emphasis from broad AI safety to a more targeted focus on national security, innovation, and U.S. competitiveness on the global stage.

Despite the name change, CAISI retained its home within NIST and continued many of the same core functions. Partnerships with Anthropic and OpenAI, first signed in August 2024, were renegotiated to align with CAISI’s updated directives and the Trump administration’s AI Action Plan. Today’s expansion to include Google DeepMind, Microsoft, and xAI represents the most significant growth of the program since its inception.

What the New Agreements Actually Entail

According to NIST’s press release, the agreements enable two primary activities: pre-deployment evaluation and post-deployment assessment of frontier AI models.

[Illustration: a shielded AI chip connected to server nodes, with a federal building, lock, radar display, and U.S. map symbolizing government testing of frontier AI for national security.]

In practice, this means each company will provide CAISI with access to its latest AI models before those models are released to the public. To allow for thorough national security testing, developers will frequently share versions of their models with safety guardrails reduced or entirely removed. This is a critical detail: testing AI systems with their safeguards intact can only reveal so much. Stripping those protections allows government evaluators to understand the raw capabilities and risks of a model, including potential misuse in areas like cyberattacks, weapons development, and disinformation campaigns.

Microsoft also committed to working with U.S. government scientists to test AI systems in ways that probe for unexpected behaviors, and to developing shared datasets and workflows for evaluating its models. Microsoft has additionally signed a parallel agreement with the UK’s AI Security Institute.

The agreements were drafted with flexibility in mind, designed to adapt quickly as AI technology continues to evolve. They also support testing in classified environments, a detail that underscores the national security dimension of the initiative.

The TRAINS Taskforce: A Whole-of-Government Approach

One of the most important features of CAISI’s testing framework is the involvement of the Testing Risks of AI for National Security (TRAINS) Taskforce. Originally established in November 2024, the TRAINS Taskforce brings together subject-matter experts from across the federal government to collaborate on AI evaluations.

According to the Department of Energy’s announcement at the time of its founding, the taskforce includes representation from the Department of Defense (including the Chief Digital and AI Office and the National Security Agency), the Department of Energy and ten of its National Laboratories, the Department of Homeland Security (including CISA), and the National Institutes of Health.

These agencies each contribute their own specialized expertise. For example, the Department of Energy’s national laboratories have deep experience in nuclear and radiological security, while CISA contributes cybersecurity evaluation capabilities. The result is a layered, multi-domain approach to assessing frontier AI risks that no single agency could replicate on its own. As Axios noted, the expansion of these agreements comes at a time when concern is growing in Washington over the national security risks posed by increasingly powerful AI systems, particularly their potential to enable sophisticated cyberattacks.

What This Means for the AI Industry

[Illustration: a glowing AI chip inside a testing chamber, with data lines connecting generic server buildings to a federal government building, symbolizing voluntary AI industry collaboration with government evaluation.]

These agreements show that the U.S. government is serious about maintaining visibility into the capabilities of frontier AI systems, and that the major AI labs are willing to participate voluntarily. With Google DeepMind, Microsoft, and xAI now joining Anthropic and OpenAI, CAISI's testing program covers all of the most prominent names in the field.

It’s worth noting that these agreements are voluntary, not regulatory mandates. The framework is built around information sharing and collaborative research rather than enforcement. As PYMNTS reported, CAISI works with the industry on testing, collaborative research, and best practice development. The agreements are designed to drive voluntary product improvements while giving the government a clear picture of AI capabilities and where the U.S. stands in the international AI competition.

For companies operating in the AI space, this development suggests that pre-deployment government testing could become an industry norm, even without formal regulation. For national security professionals, it represents a meaningful step toward closing the gap between the pace of AI development and the government’s ability to evaluate its implications.


Frequently Asked Questions

What is CAISI?

CAISI stands for the Center for AI Standards and Innovation. It's a federal body within the National Institute of Standards and Technology (NIST), part of the U.S. Department of Commerce. CAISI serves as the government's main point of contact for the AI industry, focusing on voluntary testing, research collaboration, and developing best practices for commercial AI systems.

What was the U.S. AI Safety Institute?

The U.S. AI Safety Institute (AISI) was the predecessor to CAISI. It was established in November 2023 under President Biden and was housed within NIST. In June 2025, Commerce Secretary Howard Lutnick renamed and restructured it as CAISI, shifting its mission toward national security, standards development, and U.S. AI competitiveness.

What is the TRAINS Taskforce?

TRAINS stands for Testing Risks of AI for National Security. It's an interagency taskforce originally convened in November 2024 that brings together experts from the Departments of Defense, Energy, Homeland Security, and Health and Human Services (via the National Institutes of Health). The taskforce collaborates on AI evaluation methods, benchmarks, and joint national security risk assessments.

What is NIST?

The National Institute of Standards and Technology (NIST) is a federal agency within the U.S. Department of Commerce. It promotes innovation and industrial competitiveness by advancing measurement science, standards, and technology. NIST is the organizational home of CAISI.

What is frontier AI?

Frontier AI refers to the most advanced and capable AI models currently being developed. These are systems that push the boundaries of what AI can do, and their capabilities are not yet fully understood. Because of their potential power, frontier models attract heightened scrutiny from governments and researchers concerned about misuse.

What is America's AI Action Plan?

America's AI Action Plan is a policy framework from the Trump administration that outlines the government's strategy for AI development, competitiveness, and security. CAISI's directives and its agreements with AI companies have been aligned to reflect the priorities set out in this plan.
