III. Raising the Ceiling of Harm
The enhancement of NSA capabilities within these distinct domains is, on its own, a destabilizing force that is already challenging global security. The proliferation of AI-powered tools represents a terrifying new baseline for violence, one that states are struggling to counter. This, however, is merely the floor. To focus only on these enhancements is to mistake a linear improvement for an exponential one. In time, AI will evolve from making NSAs more efficient to granting them access to new categories of harm. A dangerous and unpredictable asymmetry will define this next phase of the threat. Unconstrained by law or ethics, NSAs can innovate and deploy destructive tactics with an agility that state actors cannot match. It is this fundamental mismatch in speed and adaptability that will allow AI to raise the ceiling of harm.

These capabilities — smarter drones, more potent malware, more persuasive propaganda, more accessible bioweapons — are the individual instruments of violence. One way the ceiling may rise is with the emergence of an AI orchestrator capable of conducting these instruments in a multi-domain symphony of chaos. The convergence of these threats would be exponentially more challenging to predict, attribute, and counter than the sum of their AI-augmented components, yet another motivator for AI-enhanced, or AI-orchestrated, attacks. The technical foundations for such an orchestrator are already being laid: agentic AI systems can execute increasingly complex, multi-step tasks with growing degrees of autonomy.

The ceiling for AI-driven cyberthreats is also rising rapidly. AI is used to generate polymorphic malware, which autonomously alters its own code with each new infection, giving every instance a unique signature and thereby rendering traditional signature-based antivirus and detection systems obsolete. Simultaneously, AI is being integrated with traditional security tools such as fuzzers to automate and dramatically accelerate the discovery of novel software vulnerabilities. We may expect future AI-enabled malware to incorporate reinforcement learning techniques that allow it to learn from its environment and optimize itself for higher success rates.
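To see why signature matching fails against this kind of mutation, consider a minimal, entirely benign sketch: two snippets that do exactly the same thing produce different cryptographic hashes, so a detector keyed to the hash of one will never flag the other. The snippets and the choice of hash below are illustrative only.

```python
import hashlib

# Two harmless snippets with identical behavior but different bytes -- a crude
# stand-in for how polymorphic mutation defeats byte-level signature matching.
variant_a = b"total = 0\nfor i in range(10):\n    total += i\nprint(total)\n"
variant_b = b"s = 0\nfor n in range(10):\n    s += n\nprint(s)\n"

sig_a = hashlib.sha256(variant_a).hexdigest()
sig_b = hashlib.sha256(variant_b).hexdigest()

print("signatures match:", sig_a == sig_b)  # False: same behavior, different bytes
```

Real polymorphic engines mutate far more aggressively, but the detection problem is the same: a signature keyed to one variant says nothing about the next.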

The ceiling for influence operations will also be raised, from generating propaganda to conducting dynamic, automated cognitive warfare. Future AI systems will not only create content but also conduct population-level sentiment analysis and react to a population’s psychological state in real time. We may imagine a web of interconnected agents — each specialized for a specific purpose — conducting adaptive campaigns that monitor online sentiment, A/B test narratives and messaging, and incite undesired behavior. Or we may imagine an AI agent tasked with acting as a human recruiter, identifying vulnerable individuals online and engaging them in long-term, personalized conversations that guide them down a pipeline of radicalization.

Moreover, AI systems will, in time, move from helping exploit known software vulnerabilities to discovering and weaponizing novel exploits. An NSA may task an AI agent not with a specific target but with a goal — e.g., “create maximum chaos at a major international hub airport.” The AI scans public code repositories for vulnerabilities in popular open-source libraries used in enterprise-level industrial and logistics systems, finding a remote code execution vulnerability in a widely used open-source message queue library, a piece of middleware that many disparate, complex systems use to communicate. After monitoring flight data to identify a moment of peak passenger traffic, the AI sends a single, carefully crafted malicious data packet to the airport’s network. Because the baggage, gate, and display systems all rely on this compromised open-source library, the single packet corrupts the message queue for all of them simultaneously, and passenger traffic is halted until the problem can be diagnosed and fixed. What previously would have taken considerable expertise and careful planning may soon be as easy as writing the prompt.

An even more alarming frontier for NSA exploitation is biosecurity, where AI threatens to neutralize our most critical defenses. A key chokepoint designed to prevent a terrorist from acquiring a pathogen is the screening of custom DNA orders by commercial synthesis firms. This strategy, however, is largely predicated on checking whether an order matches a finite, static list of roughly 150 known, regulated agents and toxins. Generative AI makes this ‘blacklist’ approach dangerously obsolete. AI models like Evo 2 can now generate harmful protein sequences that are functionally equivalent to known toxins but sequentially different, creating novel biological threats. In a recent study testing this vulnerability, AI-generated harmful variants were run through the screening software used by major suppliers; in some cases, up to 100 percent of these novel sequences went undetected. The implication is that an NSA could use AI to design, and then effectively mail-order, the core components of a bespoke CBRN weapon engineered to be invisible to the biosecurity apparatus. This necessitates improved AI-powered screening tools that, rather than screening a sequence for its resemblance to blacklisted agents, toxins, or known threats, predict the sequence’s intended function and structure and its potential for harm. [Such tools have been trialed in DARPA and IARPA programs, but neither the Department of Health and Human Services/Department of Agriculture’s Select Agents and Toxins List nor the Bureau of Industry and Security’s Commerce Control List recommends or mandates their use.]
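The difference between the two screening philosophies can be made concrete with a minimal, purely illustrative sketch. Nothing below corresponds to real screening infrastructure: the placeholder strings and the `predict_risk` classifier are hypothetical stand-ins, and the point is only that list-matching and function-prediction flag fundamentally different things.

```python
from typing import Callable, Set

# Purely illustrative: placeholder strings and a hypothetical risk predictor
# stand in for real screening infrastructure. Nothing here models biology.
REGULATED_LIST: Set[str] = {
    "PLACEHOLDER_REGULATED_SEQUENCE_A",  # stands in for the ~150 listed agents and toxins
    "PLACEHOLDER_REGULATED_SEQUENCE_B",
}

def blacklist_screen(order: str) -> bool:
    """Flag an order only if it contains a sequence that appears on the list."""
    return any(listed in order for listed in REGULATED_LIST)

def function_based_screen(order: str,
                          predict_risk: Callable[[str], float],
                          threshold: float = 0.5) -> bool:
    """Flag an order if a predictive model judges its likely function hazardous,
    even when the sequence resembles nothing on the list."""
    return predict_risk(order) >= threshold

# An AI-designed sequence that is functionally equivalent to a listed toxin but
# shares no substring with the list sails past blacklist_screen(); only a
# function-based screen has any chance of flagging it.
```

The sketch makes the policy point visible: the first check can only ever catch what is already on the list, while the second depends entirely on the quality of the predictive model.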


Pathways to open-source weaponization
The popularization of powerful open-source AI models is another key accelerant raising this ceiling, creating an increasingly difficult-to-govern threat vector. Unlike proprietary models, over which developers retain some measure of control through API access, open-source models with publicly released weights grant adversaries complete autonomy to inspect, modify, and operate the system on untraceable, offline hardware, effectively bypassing most proposed regulatory chokepoints.

This creates a dual-use paradox. On one hand, a vibrant open-source ecosystem fosters innovation, enhances transparency, and prevents monopolistic control of a critical technology by a few large corporations or the government. This competition and public scrutiny can lead to more secure and reliable AI systems, a clear benefit for national security. It also benefits the broader innovation economy of the United States and the countries where its technologies are adopted. On the other hand, this very openness provides an arsenal for malicious actors who will not adhere to international law or ethical AI norms in how they use it.

Perhaps the most direct and accessible pathway to weaponizing open-source models is malicious fine-tuning. An NSA can take a general-purpose, open-source foundation model, such as Meta’s Llama or China’s DeepSeek, and further train it on a small, specialized dataset to optimize it for a specific harmful task. This process is vastly cheaper and faster than training a model from scratch, and it bypasses many of the safeguards imposed on data center customers. We may expect an NSA to fine-tune an LLM on a curated dataset of extremist texts to create a propaganda and recruitment bot that speaks with the authentic voice of the ideology. Alternatively, the NSA could fine-tune a model on a repository of malware code to create an assistant for developing cyber or CBRN weapons. Scientists have already demonstrated this vulnerability by fine-tuning the open-source biology model Evo 1 on data about viruses that infect humans.

Another technical pathway is jailbreaking: crafting adversarial prompts designed to bypass a model’s safety features and alignment training. While this technique can be used against closed models via their APIs, it is particularly potent against open-source models, whose underlying weights and architecture can be studied to discover weaknesses. A robust online ecosystem is dedicated to developing and sharing novel jailbreaks, often disseminated publicly on platforms like GitHub. Models with minimal or poorly implemented guardrails, such as DeepSeek, are highly susceptible to these techniques.

Model backdooring is a more insidious and sophisticated supply chain attack, allowing a malicious actor to embed a hidden trigger and harmful payload into an open-source model before it is publicly distributed. The compromised model behaves normally on all standard evaluations and for most users; when it receives an input containing the trigger, however, it executes a malicious function that could involve leaking data, executing code, or producing biased and harmful output. Private-sector researchers are already uncovering such techniques — the American security software company HiddenLayer discovered an exploit it dubbed ShadowLogic, which, unlike typical code exploits, manipulates a model’s computational graph to insert a backdoor that survives fine-tuning and activates on attacker-defined triggers. A well-resourced NSA could insert such a backdoor into a popular, widely used open-source model, thereby compromising thousands of downstream applications and systems built upon it.