LLM-Powered Vulnerability Detection and Patching

From DARPA's AI Cyber Challenge (AIxCC) to discovering 62 zero-day vulnerabilities across 26 major open source projects — with 36 already patched upstream.

62 Vulnerabilities Found
26 Projects Targeted
36 Patches Merged
FuzzingBrain Architecture
[Architecture diagram: CRS Web Service · Static Analysis · Worker Services · Submission Service]

About Our System

🎯 Autonomous Detection

Our system automatically generates Proofs-of-Vulnerability (POVs) and produces patches for discovered security issues without human intervention.

LLM-Powered

Leverages 23 distinct LLM-based strategies across multiple frontier models from Anthropic, Google, and OpenAI for comprehensive analysis.

Massively Parallel

Deployed across ~100 VMs with thousands of concurrent threads, enabling rapid vulnerability discovery and patch generation.

Technical Approach

System Architecture

FuzzingBrain consists of four core services working in parallel:

  • CRS Web Service: Central coordinator for task decomposition and fuzzer distribution
  • Static Analysis Service: Provides function metadata, reachability, and call path analysis
  • Worker Services: Execute parallel POV generation and patching strategies
  • Submission Service: Handles deduplication, SARIF validation, and bundling
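The four-service pipeline above can be sketched in a few lines of Python. This is an illustrative mock, not the FuzzingBrain implementation: the service interfaces, function names, and toy data are all assumptions made for clarity.

```python
# Hypothetical sketch of how the four services could be wired together.
# Interfaces and data are illustrative placeholders only.
from dataclasses import dataclass

@dataclass
class Task:
    project: str
    fuzzer: str

def crs_web_service(project, harnesses):
    """Central coordinator: decompose a target into per-fuzzer tasks."""
    return [Task(project=project, fuzzer=h) for h in harnesses]

def static_analysis(task):
    """Attach function metadata / reachability info to a task (toy data)."""
    return {"task": task, "reachable_functions": ["parse_header", "decode"]}

def worker_service(analyzed):
    """Run POV-generation and patching strategies; return candidate findings."""
    return [{"pov": f"crash-in-{fn}", "patch": f"fix-{fn}"}
            for fn in analyzed["reachable_functions"]]

def submission_service(findings):
    """Deduplicate candidate findings before bundling for submission."""
    seen, unique = set(), []
    for f in findings:
        if f["pov"] not in seen:
            seen.add(f["pov"])
            unique.append(f)
    return unique

# Wire the pipeline end to end for one toy project with two harnesses.
tasks = crs_web_service("example-project", ["fuzz_a", "fuzz_b"])
findings = [f for t in tasks for f in worker_service(static_analysis(t))]
submitted = submission_service(findings)
```

Because both harnesses surface the same two toy crashes, the submission stage collapses four raw findings down to two unique ones.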

POV Generation Strategies

Delta-Scan · Full-Scan · SARIF-Based

10 LLM-based strategies for vulnerability discovery, from basic iterative refinement to advanced multi-input generation with coverage feedback.

Patching Strategies

Multi-Model · XPatch · Path-Aware

13 patching strategies, including our novel XPatch approach, which can generate patches even when no POV is available.

Key Technical Innovations

🔄 Iterative LLM Refinement

Multi-turn dialogue with structured feedback loops incorporating execution results and coverage data
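The refinement loop can be sketched as follows. The model client and sandboxed target runner here are stand-ins (`toy_llm`, `toy_target`), not real APIs; the point is the feedback structure: each turn's prompt carries the previous attempt's execution result and coverage.

```python
# Illustrative multi-turn refinement loop with execution/coverage feedback.
def refine_pov(llm, run_target, max_turns=5):
    """Ask the model for a candidate input, execute it, and feed the
    result and coverage back into the next turn's prompt."""
    feedback = "No previous attempt."
    for turn in range(max_turns):
        candidate = llm(f"Turn {turn}: craft a crashing input. Feedback: {feedback}")
        crashed, coverage = run_target(candidate)
        if crashed:
            return candidate, turn + 1
        feedback = f"Input did not crash; covered {coverage} branches."
    return None, max_turns

# Toy stand-ins: the 'model' grows its answer by one byte per turn; the
# 'target' crashes once the input reaches 3 characters.
state = {"s": ""}
def toy_llm(prompt):
    state["s"] += "X"
    return state["s"]
def toy_target(inp):
    return (len(inp) >= 3, len(inp))

pov, turns = refine_pov(toy_llm, toy_target)  # converges on the third turn
```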

🎭 Multi-Model Fallback

Resilient architecture with automatic model switching when individual LLMs fail or reach limits
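A minimal fallback wrapper might look like this. The client functions and error type are placeholders, not real provider SDK calls: any failure mode (rate limit, timeout, refusal) is modeled as one exception class.

```python
# Hedged sketch of multi-model fallback: try each client in order and
# move on when one raises. Client names are placeholders.
class ModelUnavailable(Exception):
    pass

def with_fallback(clients, prompt):
    """Return the first successful completion across an ordered client list."""
    errors = []
    for name, client in clients:
        try:
            return name, client(prompt)
        except ModelUnavailable as e:
            errors.append((name, str(e)))
    raise RuntimeError(f"All models failed: {errors}")

# Toy clients: the first always fails, the second succeeds.
def flaky(prompt):
    raise ModelUnavailable("rate limited")

def stable(prompt):
    return f"patch for: {prompt}"

used, result = with_fallback([("model-a", flaky), ("model-b", stable)], "CVE-X")
```

Ordering the list by preferred model gives graceful degradation: the strongest model is tried first, and work continues on a weaker one rather than stalling.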

📊 Static/Dynamic Analysis Integration

Call paths, reachability, and real-time coverage feedback to guide vulnerability discovery
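One way such guidance works is pruning by reachability: only functions reachable from the fuzzer entry point are worth attacking. A toy breadth-first traversal over an assumed caller-to-callees map (a real CRS would derive this graph from analysis tooling) illustrates the idea:

```python
# Illustrative reachability pruning over a toy call graph.
def reachable_from(call_graph, entry):
    """BFS over a caller -> callees adjacency map."""
    seen, frontier = {entry}, [entry]
    while frontier:
        fn = frontier.pop()
        for callee in call_graph.get(fn, []):
            if callee not in seen:
                seen.add(callee)
                frontier.append(callee)
    return seen

call_graph = {
    "fuzz_entry": ["parse", "log"],
    "parse": ["decode_chunk"],
    "unused_helper": ["decode_chunk"],  # dead code from the fuzzer's view
}
targets = reachable_from(call_graph, "fuzz_entry")
```

Here `unused_helper` is excluded even though it calls an interesting function, since no path from the harness reaches it.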

Results

4th place in the DARPA AI Cyber Challenge, out of 7 finalists
  • 62 total vulnerabilities across 26 open source projects
  • 43 confirmed and acknowledged by maintainers
  • 36 patches merged and fixed in upstream releases
  • <5% false positive rate, backed by rigorous 4-principle verification
Projects include: CUPS, Ghidra, OpenLDAP, Apache Avro, ImageMagick, simdutf, UPX, BlueZ
Read the full report →

Performance Insights

⏱️ Speed

Sub-5-minute first findings on multiple targets

🎯 Effectiveness

AI-generated harnesses found bugs in 26 of 26 targeted projects

🏗️ Scalability

Campaign scaled from competition VMs to continuous open source fuzzing

Research Team

Ze Sheng

Texas A&M University

Qingxiao Xu

Texas A&M University

Zhicheng Chen

Texas A&M University

Jianwei Huang

Texas A&M University

Matthew Woodcock

Texas A&M University

Heqing Huang

City University of Hong Kong

Alastair F. Donaldson

Imperial College London

Guofei Gu

Texas A&M University

Jeff Huang

Team Lead

Texas A&M University

Open Source & Resources

📂

FuzzingBrain CRS

Complete Cyber Reasoning System implementation with all 23 LLM-based strategies

View on GitHub →
🏆

LLM Leaderboard

Benchmark comparing state-of-the-art LLMs on vulnerability detection and patching tasks

View Leaderboard →
📄

Technical Paper

Detailed technical description of our CRS with emphasis on LLM-powered components

Read Paper →