Anthropic’s safety-first AI collides with the Pentagon as Claude expands into autonomous agents


On February 5 Anthropic released Claude Opus 4.6, its most powerful artificial intelligence model. Among the model’s new features is the ability to coordinate teams of autonomous agents—multiple AIs that divide up the work and complete it in parallel. Twelve days after Opus 4.6’s release, the company dropped Sonnet 4.6, a cheaper model that nearly matches Opus’s coding and computer skills. In late 2024, when Anthropic first introduced models that could control computers, they could barely operate a browser. Now Sonnet 4.6 can navigate Web applications and fill out forms with human-level capability, according to Anthropic. And both models have a working memory large enough to hold a small library.

Enterprise customers now make up roughly 80 percent of Anthropic’s revenue, and the company closed a $30-billion funding round last week at a $380-billion valuation. By every available measure, Anthropic is one of the fastest-scaling technology companies in history.

But behind the big product launches and valuation, Anthropic faces a severe threat: the Pentagon has signaled it may designate the company a “supply chain risk”—a label more often associated with foreign adversaries—unless it drops its restrictions on military use. Such a designation could effectively force Pentagon contractors to strip Claude from sensitive work.



Tensions boiled over after January 3, when U.S. special operations forces raided Venezuela and captured Nicolás Maduro. The Wall Street Journal reported that forces used Claude during the operation via Anthropic’s partnership with the defense contractor Palantir—and Axios reported that the episode escalated an already fraught negotiation over what, exactly, Claude could be used for. When an Anthropic executive reached out to Palantir to ask whether the technology had been used in the raid, the question raised immediate alarms at the Pentagon. (Anthropic has disputed that the outreach was meant to signal disapproval of any specific operation.) Secretary of Defense Pete Hegseth is “close” to severing the relationship, a senior administration official told Axios, adding, “We are going to make sure they pay a price for forcing our hand like this.”

The collision exposes a question: Can a company founded to prevent AI catastrophe hold its ethical lines once its most powerful tools—autonomous agents capable of processing vast datasets, identifying patterns and acting on their conclusions—are running inside classified military networks? Is a “safety first” AI compatible with a client that wants systems that can reason, plan and act on their own at military scale?

Anthropic has drawn two red lines: no mass surveillance of Americans and no fully autonomous weapons. CEO Dario Amodei has said Anthropic will support “national defense in all ways except those which would make us more like our autocratic adversaries.” Other major labs—OpenAI, Google and xAI—have agreed to loosen safeguards for use in the Pentagon’s unclassified systems, but their tools aren’t yet running inside the military’s classified networks. The Pentagon has demanded that AI be available for “all lawful purposes.”

The friction tests Anthropic’s central thesis. The company was founded in 2021 by former OpenAI executives who believed the industry was not taking safety seriously enough. They positioned Claude as the ethical alternative. In late 2024 Anthropic made Claude available on a Palantir platform with a cloud security level up to “secret”—making Claude, by public accounts, the first large language model operating inside classified systems.

The question the standoff now forces is whether safety-first is a coherent identity once a technology is embedded in classified military operations, and whether its red lines can hold at all. “These words seem simple: illegal surveillance of Americans,” says Emelia Probasco, a senior fellow at Georgetown’s Center for Security and Emerging Technology. “But when you get down to it, there are whole armies of lawyers who are trying to sort out how to interpret that phrase.”

Consider the precedent. After the Edward Snowden revelations, the U.S. government defended the bulk collection of phone metadata—who called whom, when and for how long—arguing that these kinds of data didn’t carry the same privacy protections as the contents of conversations. The privacy debate then was about human analysts searching those records. Now imagine an AI system querying vast datasets—mapping networks, spotting patterns, flagging people of interest. The legal framework we have was built for an era of human review, not machine-scale analysis.

“In some sense, any kind of mass data collection that you ask an AI to look at is mass surveillance by simple definition,” says Peter Asaro, co-founder of the International Committee for Robot Arms Control. Axios reported that the senior official “argued there is considerable gray area around” Anthropic’s restrictions “and that it’s unworkable for the Pentagon to have to negotiate individual use-cases with” the company. Asaro offers two readings of that complaint. The generous interpretation is that surveillance is genuinely impossible to define in the age of AI. The pessimistic one, Asaro says, is that “they really want to use those for mass surveillance and autonomous weapons and don’t want to say that, so they call it a gray area.”

Regarding Anthropic’s other red line, autonomous weapons, the definition is narrow enough to be manageable—systems that select and engage targets without human supervision. But Asaro sees a more troubling gray zone. He points to the Israeli military’s Lavender and Gospel systems, which have been reported as using AI to generate massive target lists that go to a human operator for approval before strikes are carried out. “You’ve automated, essentially, the targeting element, which is something [that] we’re very concerned with and [that is] closely related, even if it falls outside the narrow strict definition,” he says. The question is whether Claude, operating inside Palantir’s systems on classified networks, could be doing something similar—processing intelligence, identifying patterns, surfacing persons of interest—without anyone at Anthropic being able to say precisely where the analytical work ends and the targeting begins.

The Maduro operation tests exactly that distinction. “If you’re collecting data and intelligence to identify targets, but humans are deciding, ‘Okay, this is the list of targets we’re actually going to bomb’—then you have that level of human supervision we’re trying to require,” Asaro says. “On the other hand, you’re still becoming reliant on these AIs to choose these targets, and how much vetting and how much digging into the validity or lawfulness of those targets is a separate question.”

Anthropic may be trying to draw the line more narrowly—between mission planning, where Claude might help identify bombing targets, and the mundane work of processing documentation. “There are all of these kind of boring applications of large language models,” Probasco says.

But the capabilities of Anthropic’s models may make those distinctions hard to sustain. Opus 4.6’s agent teams can split a complex task and work in parallel, an advance in autonomous data processing that could transform military intelligence. Both Opus and Sonnet can navigate applications, fill out forms and work across platforms with minimal oversight. The same features driving Anthropic’s commercial success are what make Claude so attractive inside a classified network. A model with a huge working memory can also hold an entire intelligence dossier. A system that can coordinate autonomous agents to debug a code base can coordinate them to map an insurgent supply chain. The more capable Claude becomes, the thinner the line between the analytical grunt work Anthropic is willing to support and the surveillance and targeting it has pledged to refuse.

As Anthropic pushes the frontier of autonomous AI, the military’s demand for those tools will only grow louder. Probasco fears the clash with the Pentagon creates a false binary between safety and national security. “How about we have safety and national security?” she asks.


