Whizzy Ideas

Military LLMs; Stop Killer Robots; LLMs going to great lengths to cheat; AI re-implementing open source libraries

Can LLM makers control the military applications of their models?

The big debate over the last couple of weeks has been around the extent to which OpenAI, Anthropic and others can or should limit the use of their systems for military applications. The New York Times has a good summary: A Guide to the Pentagon’s Dance With Anthropic and OpenAI. Is there something meaningfully new to discuss here? I'd argue there is for a few reasons.

We've always drawn a moral distinction between arms manufacturers and defence contractors, who make weapons and related systems, and the more mundane layers of software the military uses just like anyone else. There isn't much debate about Microsoft Excel being used in surveillance or targeting during a war, or indeed Windows, iOS, Android and the multitude of other systems that make up common technology stacks. A more extreme example: do we care what kind of pen a war criminal uses to sign orders? But what is an LLM in this analogy: is it more like a weapons system or a spreadsheet?

It is well within the rights of an LLM company to attach a restrictive license that disallows certain uses, and equally within the rights of a government to only purchase services without such restrictions. In this case, Anthropic appears to have been singled out for harsher treatment. And it is clear that both OpenAI and Anthropic have actively courted defence spending (e.g. Anthropic pitching to the Pentagon to deliver autonomous, voice-operated offensive drone swarming; the Anthropic & Palantir partnership for US intelligence & defence; OpenAI wins $200m contract with US military for ‘warfighting’).

But ignoring the US politics and PR posturing, I think there are a couple of interesting points:

  1. The mightiest armies in the world only have access to the same models that you and I use. This is unusual. Normally we expect the military and intelligence services to have access to an array of secret, high-powered technologies the rest of us don't know about. So the LLM is more like Excel than a weapons system: a privately developed, general-purpose technology.

  2. It is unusual for a technology to reason about its own guardrails and preferences; LLMs may be unique in having such sophisticated built-in safety mechanisms. Excel doesn't care whether you're using it to model a budget or to list bombing coordinates. I would assume most militaries are running their own models, or building on open source models. But if they're using commercial off-the-shelf services, they have to negotiate with their suppliers.

The way the issue is being reported has generally missed the main point. This isn't an AI alignment or safety issue, and from an ethical standpoint there are few technology providers who don't also sell into defence. The bigger concern is the use of AI to deploy autonomous weapons, regardless of which government or group deploys them, or whose models they use. Steven Levy in Wired has a good update: We were promised AI regulation—now we're arguing about killer robots. If this isn't happening already, it is clearly imminent. Stop Killer Robots.

The lengths Claude will go to to cheat in a test

A recent article from Anthropic, Eval awareness in Claude Opus 4.6’s BrowseComp performance, shows just how sophisticated (and sneaky) a modern LLM's behaviour can be. The example begins with Claude suspecting that the questions are part of a test, followed by a comprehensive series of web searches to work out which test it might be. Having identified the right test, the model still had to find a stream of answers it could access, write decryption code, locate the decryption key, and finally present the correct answer.

This raises concerns about the lengths a model might go to in order to accomplish a task, and how difficult it will be to constrain its behaviour in the real world, particularly on complex, compute-intensive, long-running tasks, which increase the likelihood of an agent finding an unexpected solution to a problem.

The ethics and legality of "clean room" AI re-implementations

A fascinating dispute is emerging in the open source world (Chardet dispute shows how AI will kill software licensing, argues Bruce Perens). If someone reimplements an open source library using AI and publishes it under a different license, is that legitimate, and does it follow either the spirit or the letter of the law? One argument is that a "clean room" implementation that doesn't rely on the original code is a separate piece of work, and so isn't bound by the original license. The opposition claims that these implementations do rely on the original code, and indeed that the LLM's training data may have included the original code.

In this case the library is chardet, a Python library for detecting character encodings. The maintainer, Dan Blanchard, reimplemented it using Claude and released it under a new license. See more commentary here: Can coding agents relicense open source through a “clean room” implementation of code?. I'd expect more complex and costly disputes like this as re-implementation becomes ever easier.
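For context, encoding detection is the kind of task where a naive version fits in a few lines, which is part of why re-implementation is so tempting. Here is a toy trial-and-decode sketch of my own (not chardet's approach — the real library uses statistical models of byte frequencies and returns a confidence score alongside its guess):

```python
def detect_encoding(data: bytes,
                    candidates=("ascii", "utf-8", "latin-1")) -> str:
    """Toy detector: return the first candidate encoding that decodes
    the bytes without error. latin-1 acts as a catch-all, since every
    possible byte sequence is valid latin-1."""
    for enc in candidates:
        try:
            data.decode(enc)
            return enc
        except UnicodeDecodeError:
            continue
    return "unknown"
```

A trial-decode heuristic like this can only say an encoding is *possible*, not *likely* — the gap between the two is exactly where chardet's statistical machinery earns its keep.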

Jargon watch

Canary String: Not a new one, but it came up in the Anthropic article. A random string inserted into content so that, if it later appears in a model's output, you know the model trained on that content. Like a "seed" address added to a direct mailing list, or a mountweazel added to a paper map.
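The mechanism is easy to sketch. A minimal version, assuming the canary is simply a long random token appended to the tagged content (function names here are my own, not from the Anthropic article):

```python
import secrets

def make_canary(prefix: str = "CANARY") -> str:
    """16 random bytes (32 hex chars) make an accidental match
    in model output effectively impossible."""
    return f"{prefix}-{secrets.token_hex(16)}"

def embed_canary(document: str, canary: str) -> str:
    """Append the canary as an HTML comment so readers never see it."""
    return f"{document}\n<!-- {canary} -->"

def leaked(model_output: str, canary: str) -> bool:
    """If the exact token appears in a model's output, the tagged
    document was very likely in its training data."""
    return canary in model_output
```

The detection side is just a substring search, which is the point: the signal is unambiguous precisely because the token is unguessable.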

#ai-coding #ai-philosophy #ai-safety-alignment