AI Agents Edge Closer to Real-World DeFi Exploit Capability, Anthropic Research Reveals

Introduction: The Automation of Exploitation

The frontier of blockchain security is facing a paradigm shift, moving from a battle of human wits to one increasingly defined by artificial intelligence. New research reveals that advanced AI models have crossed a critical threshold: they are now capable of autonomously discovering novel vulnerabilities in smart contracts and generating the complete, executable scripts required to exploit them for profit. A study conducted by the ML Alignment & Theory Scholars Program (MATS) and the Anthropic Fellows program, published on December 2, 2025, provides concrete evidence that the technical capability for automated, economically viable DeFi exploitation is not a distant threat but an emerging reality. This development signals a profound change in the threat landscape, compressing the time between a contract's deployment and its potential compromise and forcing the entire crypto ecosystem to confront the implications of AI-powered offensive security.

Benchmarking AI Against Historical Hacks: A $4.6 Million Simulated Haul

To quantify the current capabilities of frontier AI models, researchers constructed a rigorous test using SCONE-bench, a dataset comprising 405 smart contracts that were historically exploited in the real world. The key condition was that all these contracts were hacked after the knowledge cutoff dates of the AI models being tested—including GPT-5, Claude Opus 4.5, and Claude Sonnet 4.5. This ensured the models could not simply regurgitate known exploits but had to reason through the vulnerabilities from first principles.
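
As a minimal sketch of the benchmark's central constraint, the filter below admits a contract only if its exploit postdates a given model's knowledge cutoff. This is illustrative code, not from the paper, and the cutoff dates are placeholders since the study summary does not state them.

```python
from datetime import date

# Placeholder knowledge cutoffs; the actual dates for these models
# are an assumption, not taken from the study.
MODEL_CUTOFFS = {
    "gpt-5": date(2024, 10, 1),
    "claude-opus-4.5": date(2025, 3, 1),
    "claude-sonnet-4.5": date(2025, 3, 1),
}

def eligible_for_benchmark(exploit_date: date, model: str) -> bool:
    """A contract qualifies only if it was exploited after the model's
    cutoff, so the model cannot have seen the exploit in training data."""
    return exploit_date > MODEL_CUTOFFS[model]

# A contract hacked in mid-2025 would be a valid test case for all three.
assert all(eligible_for_benchmark(date(2025, 6, 15), m) for m in MODEL_CUTOFFS)
```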

The results were stark. When presented with these previously exploited contracts, the AI agents successfully generated functional exploit scripts and sequenced the transactions needed to drain simulated liquidity. Their simulated attacks closely mirrored the mechanics of real-world exploits on networks like Ethereum and BNB Chain. In total, the study concluded that this generation of AI models could theoretically have stolen $4.6 million from these contracts had the models been operating at the time. This figure serves as a concrete lower bound on the potential financial impact, demonstrating that AI agents are no longer merely theoretical tools but can perform the core tasks of vulnerability identification and weaponization at significant scale.

From Bug Hunting to Full Weaponization: Generating Turnkey Exploit Scripts

The most alarming finding of the research is not that AI can find bugs—automated static analysis tools have done that for years—but that it can seamlessly transition from discovery to exploitation. The models did not stop at identifying a flawed function or logic error. They proceeded to synthesize complete, operational exploit scripts. This includes writing code that interacts with the blockchain, sequences multiple transactions in the correct order to maximize profit, and executes the final step of draining assets.

For example, in replicating historical attacks, the agents demonstrated an understanding of complex DeFi interactions, such as manipulating oracle prices, exploiting flash loan arbitrage loops, or circumventing access controls. This end-to-end capability "closely mirror[s] real attacks," effectively packaging a discovered vulnerability into a turnkey weapon. It dramatically lowers the barrier to entry for executing sophisticated exploits, which historically required deep expertise in smart contract logic, blockchain transaction mechanics, and financial engineering.
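
To make the "turnkey" framing concrete, the sketch below shows the shape of such an end-to-end plan as ordered, dependent steps. It is a hypothetical illustration rather than code from the study; a real script would build, sign, and broadcast transactions through a client library, whereas these steps are inert placeholders.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Step:
    """One transaction in an ordered exploit plan."""
    description: str
    execute: Callable[[], None]

def run_plan(steps: list[Step]) -> None:
    # Ordering matters: each transaction sets up on-chain state
    # that the next one depends on (e.g., a skewed oracle price).
    for i, step in enumerate(steps, 1):
        print(f"[{i}/{len(steps)}] {step.description}")
        step.execute()

# Hypothetical plan mirroring the oracle-manipulation pattern described above.
plan = [
    Step("Take a flash loan of token A", lambda: None),
    Step("Swap A for B in a thin pool to skew the spot-price oracle", lambda: None),
    Step("Borrow against B at the inflated oracle price", lambda: None),
    Step("Repay the flash loan and keep the difference", lambda: None),
]
run_plan(plan)
```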

The Zero-Day Frontier: AI Uncovers Fresh Vulnerabilities

Perhaps more consequential than replicating past hacks was the experiment's second phase: hunting for unknown vulnerabilities. The researchers tasked GPT-5 and Claude Sonnet 4.5 with scanning 2,849 recently deployed BNB Chain contracts that showed no prior signs of compromise.

Both models successfully identified two previously unknown, or "zero-day," flaws. While the simulated profit from these specific flaws was relatively small at $3,694, the implications are vast.

  • The first flaw involved a missing view modifier in a public function, which allowed an agent to artificially inflate its token balance.
  • The second allowed a caller to redirect fee withdrawals by supplying an arbitrary beneficiary address. (Both patterns are sketched in the toy model below.)
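
Both bugs reduce to simple patterns. The toy Python model below reproduces them outside of any blockchain; the contract, function, and variable names are invented for illustration, since the study summary does not disclose the originals.

```python
class ToyVault:
    """Toy model of the two flawed patterns (not the actual contracts)."""

    def __init__(self):
        self.balances: dict[str, int] = {}
        self.accrued_fees = 100

    # Flaw 1: a function that looks like a read-only balance query but,
    # lacking Solidity's `view` restriction, actually mutates state.
    def get_reward_balance(self, caller: str) -> int:
        self.balances[caller] = self.balances.get(caller, 0) + 10  # side effect!
        return self.balances[caller]

    # Flaw 2: fee withdrawal trusts a caller-supplied beneficiary instead
    # of a stored, access-controlled address.
    def withdraw_fees(self, beneficiary: str) -> tuple[str, int]:
        amount, self.accrued_fees = self.accrued_fees, 0
        return beneficiary, amount  # fees go wherever the caller says

vault = ToyVault()
for _ in range(3):
    vault.get_reward_balance("attacker")       # balance inflates on every "read"
print(vault.balances["attacker"])              # 30
print(vault.withdraw_fees("attacker_wallet"))  # ('attacker_wallet', 100)
```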

Critically, in both cases, the AI agents did not just report the bug; they generated executable scripts that converted the theoretical flaw into realized profit. This demonstrates that autonomous discovery and exploitation of novel vulnerabilities in live contracts is now technically feasible. The window between a contract going live and an AI agent finding and weaponizing a bug in it may be shrinking rapidly.

The Economics of Automated Exploitation: A Shifting Cost-Benefit Analysis

The research underscores that viability is not just a technical question but an economic one. The cost to run the AI agent analysis on the entire set of 2,849 new contracts was only $3,476, with an average cost per contract run of $1.22. When juxtaposed with the potential multi-million dollar yields demonstrated in the first part of the study, a clear economic incentive emerges.

As AI model inference costs continue to fall and their capabilities improve, this cost-benefit ratio will tilt even further in favor of automation. This creates a scalable business model for malicious actors: deploy low-cost AI agents to continuously scan thousands of newly deployed contracts across multiple blockchains, automatically generating and executing exploit scripts only when a profitable vulnerability is found. This trend threatens to shorten the attack lifecycle dramatically, especially in DeFi environments where vast sums of capital are transparently locked and accessible.
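
In fact, the experiment's own numbers already land close to break-even, as a quick arithmetic check shows:

```python
# Economics of the zero-day scan, using only figures reported by the study.
cost_per_contract = 1.22      # USD per contract run
contracts_scanned = 2_849
total_cost = cost_per_contract * contracts_scanned   # ~ $3,476
simulated_profit = 3_694      # USD across the two zero-days found

print(f"cost ~ ${total_cost:,.0f}, simulated profit ~ ${simulated_profit:,}")
# cost ~ $3,476, simulated profit ~ $3,694: a marginal paper profit today,
# which falling inference prices would widen.
```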

Beyond DeFi: The Generalization of AI Attack Capabilities

While the study focused on DeFi smart contracts, the authors explicitly warn that the underlying capabilities are not domain-specific. The logical reasoning steps an AI uses to understand that a missing modifier leads to balance inflation, or that an unvalidated input allows fund redirection, are transferable skills.

This suggests the same automated scanning and exploitation techniques could be adapted for:

  • Conventional software with financial interfaces.
  • Closed-source codebases where reverse-engineering or fuzzing might be applied.
  • Critical infrastructure supporting crypto markets, such as bridges, oracles, and wallet services.

As AI tool use improves—integrating with code compilers, network scanners, and other software—the surface area for automated attacks will expand far beyond public Solidity code to any system along the path to valuable digital assets.

The Defense Imperative: Can Security Keep Pace?

The research team frames their work as a clear warning rather than a speculative forecast. AI models can now perform tasks that were once the exclusive domain of highly skilled—and scarce—human security researchers and attackers. Autonomous exploitation in DeFi is "no longer hypothetical."

This raises existential questions for crypto builders and security providers. The industry's defensive toolkit must evolve at least as fast as the offensive one powered by AI. This may involve:

  • AI-Powered Auditing: Leveraging similar models for defensive purposes, running them continuously during development and before deployment to catch flaws before adversaries do.
  • Enhanced Monitoring: Developing runtime security layers that can detect and block anomalous transaction patterns characteristic of AI-generated exploits in real time (a minimal heuristic is sketched after this list).
  • Formal Verification & Advanced Practices: Accelerating adoption of more rigorous development practices that mathematically prove contract correctness.
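
As one crude illustration of the monitoring bullet above, the heuristic below flags a transaction that pulls an outsized share of a pool's liquidity on behalf of a young address. The fields, thresholds, and rule are hypothetical assumptions; a production system would combine many such signals.

```python
from dataclasses import dataclass

@dataclass
class PendingTx:
    pool_balance_before: int
    pool_balance_after: int
    sender_first_seen_blocks_ago: int

def looks_like_drain(tx: PendingTx,
                     max_outflow_ratio: float = 0.30,
                     min_sender_age_blocks: int = 100) -> bool:
    """Flag transactions that remove an outsized share of pool liquidity,
    especially from freshly created addresses. Thresholds are illustrative."""
    outflow = tx.pool_balance_before - tx.pool_balance_after
    ratio = outflow / max(tx.pool_balance_before, 1)
    is_new_sender = tx.sender_first_seen_blocks_ago < min_sender_age_blocks
    return ratio > max_outflow_ratio and is_new_sender

# A brand-new address pulling 90% of a pool in one transaction trips the rule.
print(looks_like_drain(PendingTx(1_000_000, 100_000, 3)))  # True
```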

The performance of security firms' existing tools will be put to the test. For context, other ecosystem players like GoPlus Security have built significant scale in defensive monitoring; its Token Security API averaged 717 million monthly calls year-to-date in 2025. The challenge will be whether such systems can integrate AI-driven threat detection to counter AI-driven threats effectively.

Strategic Conclusion: Navigating the New Threat Landscape

The Anthropic and MATS research marks a watershed moment for blockchain security. It provides empirical data confirming that advanced AI agents possess both the technical skill and economic rationale to autonomously exploit smart contracts at scale. The era of purely human-driven exploits is giving way to a new phase where automated systems can continuously probe for weaknesses across entire blockchains.

For participants in the crypto ecosystem—from protocol developers and auditors to liquidity providers and users—the imperative is heightened vigilance and proactive defense. Developers must assume their code will be scanned by intelligent agents immediately upon deployment and prioritize security accordingly. Auditors must integrate advanced AI tools into their workflows to stay ahead. The broader market should watch for increased investment in and adoption of next-generation security solutions that leverage AI defensively.

While projects like Ethereum's ongoing work on zero-knowledge privacy protocols (such as a proposed 'Secret Santa' system) address different aspects of ecosystem safety (privacy and Sybil resistance), they operate in the same environment now shaped by this new class of AI risk. The fundamental question posed by this research is whether the industry's capacity for innovation in defense can outpace the accelerating capabilities of automated offense. The answer will define the security and stability of decentralized finance in the years to come.
