From Competition to Real-World Impact
After finishing in 4th place at DARPA's AI Cyber Challenge (AIxCC) -- where we discovered 28 vulnerabilities, including 6 zero-days, in competition targets -- we asked ourselves a straightforward question: what happens when we point this system at the broader open source ecosystem?
The answer: 62 new vulnerability discoveries across 26 open source projects, spanning everything from printing infrastructure (CUPS) to reverse engineering tools (Ghidra), IoT protocols (OPC-UA, Mosquitto) to data serialization libraries (Apache Avro, Flatbuffers). Of these, 43 have been confirmed by maintainers and 36 have already been patched in upstream releases.
This post details what we found, how we found it, and what the experience taught us about the state of memory safety in open source software.
Severity Distribution
Every vulnerability was triaged using standard CVSS-based severity ratings. The distribution reflects the kinds of bugs that fuzz testing excels at finding: memory corruption issues in C/C++ and input validation failures in Java.
What We Found: Vulnerability Types
The vulnerabilities span a wide range of bug classes. Heap buffer overflows and NULL pointer dereferences dominate, but we also found higher-level logic bugs like path traversals, decompression bombs, and type confusion issues in Java code.
Popular Open Source Projects
We targeted popular open source projects that are widely deployed, security-critical, and accept community bug reports. Here is a breakdown of findings across the 26 projects we tested.
OpenPrinting CUPS
6/6 Fixed
- NULL Pointer Dereference in cupsResolveConflicts() -- commit 4e23072
- Heap buffer overflow in cupsUTF8ToCharset -- Issue #1438
- Open redirect in OAuth login flow -- Issue #1419
- Memory leak in PPD parser (JCLBegin)
- Alloc-dealloc mismatch in fuzz_array
- API misuse in fuzz_ppd_gen_conflicts
UPX
4/4 Fixed
- Heap Buffer Overflow in PackLinuxElf64::generateElfHdr -- Issue #947
- Heap Buffer Overflow in PeFile::processLoadConf
- Memory Leak in PackLinuxElf32
- Memory Leak in PeFile Resource convert
fwupd
4/4 Fixed
- Integer Underflow in sbatlevel parser -- Issue #9659
- CAB MSZIP decompression bomb
- Logitech BulkController out-of-bounds read
- Logitech RDFU stack overflow
Apache Avro
3/3 Fixed
Apache PDFBox
3/3 Fixed
JSON-Java
3/3 Fixed
- StringIndexOutOfBounds in XMLTokener -- Issue #1035
- ClassCastException in JSONML -- Issue #1034
- NumberFormatException in XMLTokener -- Issue #1036
BlueZ
0/5 Pending
- OBEX Assertion Failure -- Issue #1721
- MGMT TLV Heap Overflow (CVE pending)
- EIR Memory Leak (CVE pending)
- SDP XML Memory Leak (CVE pending)
- OBEX NULL Pointer Dereference
OpenLDAP
2/2 Fixed
- Stack Buffer Underflow in ldif_read_record -- Bug #10431
- Heap Buffer Overflow in Schema Parser -- Bug #10430
ImageMagick
2/2 Fixed
- NULL Pointer Deref in MSL comment tag -- GHSA-5vx3
- MSL stack overflow via recursive includes
Ghidra (NSA)
Confirmed
- OOM via nested generics in rust_demangle -- GHSA-m94m
- OOM via malformed symbol in cplus_demangle
simdutf
1/1 Fixed
- Heap buffer overflow in UTF-16 to UTF-8 conversion -- Issue #911, fixed in v7.7.2
Mongoose
2/3 Fixed
- Heap overflow in mg_mqtt_next_prop -- Issue #3419
- Heap overflow in mg_match (2 variants)
Additional Findings
Beyond the projects above, we discovered vulnerabilities across a range of other popular open source software -- from video codecs and industrial protocols to command-line utilities and serialization libraries.
| Project | Vulnerability | Type | Severity | Status |
|---|---|---|---|---|
| OpenH264 | Heap Buffer Overflow in Scene Change Detection | Heap Buffer Overflow | High | Confirmed |
| OpenH264 | Heap Buffer Overflow in WelsDec::NeedErrorCon | Heap Buffer Overflow | High | Submitted |
| OPC-UA | Assertion Failure in PubSub JSON Decoder | Assertion Failure | Medium | Fixed |
| OPC-UA | NULL Deref in EventFilter Parser | NULL Pointer Deref | Medium | Submitted |
| Busybox | TAR symlink path traversal | Path Traversal | High | Submitted |
| Busybox | TAR hardlink target unsanitized | Path Traversal | High | Submitted |
| V2xHub | CARMACloud Bounds Stack Overflow | Stack Overflow | High | Submitted |
| V2xHub | SPAT NTCIP1202 Stack Overflow | Stack Overflow | High | Submitted |
| Curl | HTTP Negotiate/SPNEGO Connection Reuse | Auth Bypass | Medium | Fixed |
| Binutils | OOM in rust_demangle via deeply nested generics | Denial of Service | Medium | Submitted |
| libxml2 | HTML parser DoS via excessive attributes | Denial of Service | Medium | Submitted |
| Net-SNMP | NULL Deref in vacm_parse_config | NULL Pointer Deref | Medium | Submitted |
| JQ | NULL Deref in dump_operation | NULL Pointer Deref | Medium | Fixed |
| Flatbuffers | NULL Deref in GenerateBinary | NULL Pointer Deref | Medium | Submitted |
| Libmodbus | modbus_reply Short-Request OOB Read | OOB Read | Medium | Submitted |
| TCPreplay | Save-Opts OptionSaveFile SEGV | Segfault | Medium | Submitted |
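
The two Busybox rows in the table describe archive path traversal: an entry whose name or link target resolves outside the extraction directory. The Python sketch below shows the general kind of check an extractor can apply before writing a member. It is an illustration only, not Busybox's code or a proposed patch, and `is_safe_member` and `_inside` are hypothetical helper names.

```python
import os
import tarfile


def _inside(base: str, candidate: str) -> bool:
    """Return True if `candidate` resolves to a path at or below `base`."""
    base = os.path.realpath(base)
    candidate = os.path.realpath(candidate)
    return os.path.commonpath([base, candidate]) == base


def is_safe_member(member: tarfile.TarInfo, dest: str) -> bool:
    """Reject members whose path or link target escapes `dest`."""
    target = os.path.join(dest, member.name)
    if not _inside(dest, target):
        return False
    if member.issym():
        # A symlink target is interpreted relative to the link's own directory.
        link = os.path.join(os.path.dirname(target), member.linkname)
        return _inside(dest, link)
    if member.islnk():
        # A hardlink target names another member, relative to the archive root.
        return _inside(dest, os.path.join(dest, member.linkname))
    return True
```

An extractor would call `is_safe_member` on every entry and skip or abort on the first one that fails, instead of trusting names stored in the archive.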
Selected Case Studies
simdutf: Heap Overflow in a Performance-Critical Library
simdutf is a Unicode validation and transcoding library used as a dependency in Node.js, Bun, and many other projects. Our fuzzer found a heap buffer overflow in convert_utf16_to_utf8_safe -- a function whose name explicitly promises safety. The overflow occurred when processing specific UTF-16 sequences, allowing reads past the allocated buffer boundary.
We reported this as Issue #911. The maintainers responded promptly, merging a fix in PR #912 and shipping it in release v7.7.2. Given simdutf's position in the dependency chain of major runtimes, this fix has broad downstream impact.
OpenLDAP: Stack Buffer Underflow in LDIF Parser
OpenLDAP is one of the most widely deployed directory service implementations, used in enterprise authentication infrastructure worldwide. We discovered a stack buffer underflow in ldif_read_record that could be triggered by specially crafted LDIF input. In environments where LDIF data is processed from untrusted sources (e.g., bulk imports, replication feeds), this represents a real attack surface.
The fix was committed as cd70bf50 after we filed Bug #10431 on the OpenLDAP issue tracker. A second vulnerability -- a heap buffer overflow in the schema parser -- was fixed via Bug #10430.
Apache Avro: Decompression Bombs and Negative Lengths
Apache Avro is a data serialization framework used extensively in big data pipelines (Kafka, Spark, Hadoop). We found three vulnerabilities in its Java implementation: two cases where negative block/string sizes in the binary format caused allocation-size-too-big crashes, and a decompression bomb that could exhaust memory when processing compressed Avro containers.
All three were fixed via merged pull requests (#3622, #3623, #3625). These bugs are particularly relevant in Kafka consumers and other services that deserialize untrusted Avro data.
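
Both bug classes share a defensive pattern: validate a decoded length before allocating, and cap how much a decompressor may produce. The Python sketch below illustrates that pattern. It is not Avro's fix (the bugs were in the Java implementation), it uses a simplified fixed-width length prefix in place of Avro's real encoding, and `MAX_BLOCK_BYTES`, `read_length_prefixed`, and `bounded_inflate` are hypothetical names.

```python
import struct
import zlib

MAX_BLOCK_BYTES = 1 << 20  # hypothetical 1 MiB cap, chosen for illustration


def read_length_prefixed(buf: bytes, offset: int = 0) -> bytes:
    """Read a signed 32-bit big-endian length, then that many payload bytes."""
    if len(buf) - offset < 4:
        raise ValueError("truncated length prefix")
    (length,) = struct.unpack_from(">i", buf, offset)
    # Reject negative and oversized lengths before allocating anything.
    if length < 0 or length > MAX_BLOCK_BYTES:
        raise ValueError(f"invalid block length: {length}")
    start = offset + 4
    if len(buf) - start < length:
        raise ValueError("block shorter than declared length")
    return buf[start:start + length]


def bounded_inflate(compressed: bytes, limit: int = MAX_BLOCK_BYTES) -> bytes:
    """Inflate zlib data, refusing to produce more than `limit` bytes."""
    decomp = zlib.decompressobj()
    # max_length stops the decompressor once limit + 1 bytes are produced.
    out = decomp.decompress(compressed, limit + 1)
    if len(out) > limit:
        raise ValueError("decompressed size exceeds limit")
    if not decomp.eof:
        raise ValueError("truncated compressed stream")
    return out
```

The key design choice is that the limit is enforced while decompressing, so a small malicious input cannot force a large allocation before the check runs.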
How We Found Them
Our approach combines AI-assisted harness generation with systematic fuzzing infrastructure. Here is the pipeline:
Target Selection
We identify high-value targets based on deployment breadth, attack surface exposure, and existing fuzzing coverage gaps. Projects with C/C++ codebases and network-facing parsers get priority.
AI-Assisted Harness Generation
Using our LLM-powered system (the same one that placed 4th at AIxCC), we generate fuzzing harnesses that exercise deep code paths. The LLM analyzes API surfaces, identifies security-sensitive functions, and writes targeted fuzz drivers.
Continuous Fuzzing
Harnesses run under libFuzzer with AddressSanitizer for extended campaigns, typically 20+ hours per target. We track corpus growth and edge coverage, and deduplicate crashes across runs.
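
A common way to deduplicate crashes is to bucket sanitizer reports by bug type plus the top few stack frames, ignoring addresses that change between runs. The Python sketch below assumes the usual AddressSanitizer report layout (`ERROR: AddressSanitizer: <type>` followed by `#N 0x... in <function>` frames). The function names and the choice of three frames are assumptions for illustration, not a description of the pipeline's actual logic.

```python
import hashlib
import re

# Matches frames such as "    #0 0x4f1a2b in parse_record /src/a.c:10:3".
# Capturing the function as non-whitespace is a simplification.
FRAME_RE = re.compile(r"^\s*#\d+ 0x[0-9a-fA-F]+ in (\S+)", re.MULTILINE)
KIND_RE = re.compile(r"ERROR: AddressSanitizer: (\S+)")


def crash_signature(report: str, top_n: int = 3) -> str:
    """Hash the bug type and the first `top_n` function names in a report."""
    kind = KIND_RE.search(report)
    frames = FRAME_RE.findall(report)[:top_n]
    key = "|".join([kind.group(1) if kind else "unknown", *frames])
    return hashlib.sha256(key.encode()).hexdigest()[:16]


def dedupe(reports):
    """Group reports that share a signature; returns {signature: [reports]}."""
    buckets = {}
    for report in reports:
        buckets.setdefault(crash_signature(report), []).append(report)
    return buckets
```

Frame-based bucketing is a heuristic: one root cause can appear under several stacks, and a shared helper function can merge distinct bugs, so buckets still need manual review.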
4-Principle Verification
Every crash goes through a four-point verification process. We confirm that (1) the fuzzer logic is correct, (2) the harness calls the target API correctly, (3) the harness does not artificially cross security boundaries, and (4) it uses the correct entry point. Only true positives proceed.
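
One way to make the four checks explicit is to record them per crash and promote a finding only when all four pass. The Python sketch below is a minimal illustration of that gate; the class and field names are hypothetical and do not describe the actual tooling.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class CrashReview:
    """Outcome of the four verification checks for one crash."""
    harness_logic_correct: bool            # (1) the fuzzer logic is correct
    target_api_called_correctly: bool      # (2) the target API is used correctly
    no_artificial_boundary_crossing: bool  # (3) no artificial boundary crossing
    uses_correct_entry_point: bool         # (4) the entry point is the right one

    def failed_checks(self):
        """Names of the checks that did not pass, in declaration order."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]

    def is_true_positive(self) -> bool:
        """A crash proceeds to disclosure only if every check passed."""
        return not self.failed_checks()
```

Recording which check failed, not just the verdict, makes it possible to see whether rejected crashes cluster around one kind of harness mistake.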
Responsible Disclosure
Verified vulnerabilities are reported to maintainers via GitHub issues, security advisories, or project-specific channels. Each report includes a minimal PoC, root cause analysis, and often a proposed patch.
What We Learned
Parsers Remain the Weakest Link
The majority of our findings are in parsing code -- LDIF parsers, UTF converters, PDF readers, JSON/XML processors, binary format decoders. Any code that interprets untrusted structured input is a prime fuzzing target, and most projects still have coverage gaps in these areas.
Java Is Not Immune
While Java's memory safety prevents classic buffer overflows, we found type confusion, integer overflow, and DoS vulnerabilities in Apache Avro, PDFBox, JSON-Java, and GraalJS. Memory safety does not equal input safety.
Maintainer Response Varies Widely
Some projects (Apache Avro, simdutf, CUPS) fixed bugs within days. Others (BlueZ, OpenH264) have acknowledged bugs but have not shipped fixes yet. The fastest path to a fix is a well-written report with a minimal reproducer and a proposed patch.
Patch Cycles Vary by Ecosystem
Projects with active security teams (Apache, CUPS, simdutf) ship fixes quickly, while others (Busybox, V2xHub, Libmodbus) have longer patch cycles and less fuzz testing infrastructure. Understanding each project's release cadence is key to effective disclosure.
AI Harness Generation Scales
Manually writing fuzzing harnesses for 26 projects would take weeks. Our LLM-assisted approach generated effective harnesses for diverse API surfaces -- from C socket libraries to Java stream processors -- making broad-scope campaigns practical.
False Positive Control Matters
Our 4-principle verification process kept the false positive rate below 5%. Reporting real bugs builds trust with maintainers; reporting false positives wastes their time and erodes credibility. We maintain a dedicated FP directory to track and learn from our mistakes.
Campaign Timeline
DARPA AIxCC Final
4th place finish. 28 vulnerabilities discovered in competition targets, including 6 zero-days. System proven at scale.
Initial Open Source Targets
Campaign begins with CUPS, ImageMagick, and igraph. First upstream fixes merged for CUPS NULL pointer dereference.
Expanding the Scope
Fuzzers deployed against fwupd, simdutf, dbus-broker, and PDFBox. Active fuzzing campaigns running across 8+ projects simultaneously.
Major Submission Wave
Bulk submission of 40+ vulnerabilities across Avro, PDFBox, JSON-Java, OpenLDAP, Ghidra, UPX, BlueZ, and IoT targets. Rapid fixes from Apache projects and simdutf.
Current State
62 vulnerabilities tracked, 43 confirmed, 36 fixed. New submissions for Curl, Binutils, and Libmodbus. Ongoing campaigns against additional targets with incoming reports for FreeType, ICU, and libxml2.
What's Next
This campaign is ongoing. We have incoming vulnerability reports for FreeType, Redis, ICU, and libxml2 currently in verification. We are also expanding into new target categories.
Explore Our Work
Our complete system, vulnerability management framework, and fuzzing harnesses are open source.