Four Core Services
FuzzingBrain decomposes the security workflow into four independently scalable services, each with clear contracts and idempotent operations.
CRS Web Service
Role: Central coordinator. Decomposes each challenge into 50+ fuzzer-target jobs per sanitizer configuration, tracks state, and assigns work.
Scale Tactics: Sharded queues, idempotent job tokens, and backpressure when workers saturate.
Static Analysis Service
Role: Precomputes reachability, call paths, and function metadata. Exposes results as JSON to keep workers fast and stateless.
Scale Tactics: Aggressive caching and timeouts on oversized projects; results reused across strategies.
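As a sketch of how a worker might consume these precomputed results (the cache directory and JSON layout below are assumptions, not the project's actual schema):

import json
from functools import lru_cache
from pathlib import Path

ANALYSIS_DIR = Path("/srv/static-analysis")  # hypothetical shared cache location

@lru_cache(maxsize=128)
def load_analysis(project: str) -> dict:
    # Load the precomputed analysis once per process; workers stay stateless.
    with (ANALYSIS_DIR / project / "reachability.json").open() as f:
        return json.load(f)

def reachable_functions(project: str, harness: str) -> list[str]:
    # Look up which functions a given fuzz harness can reach.
    return load_analysis(project).get("reachable", {}).get(harness, [])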
Worker Services
Role: Execute discovery and patch strategies in parallel, each in an isolated workspace.
Scale Tactics: Per-job temp dirs, unique artifact paths, and limited concurrency per worker to avoid I/O contention.
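A minimal sketch of that isolation, assuming asyncio-based workers; the concurrency cap and the strategy callable are illustrative:

import asyncio
import shutil
import tempfile
from pathlib import Path

_slots = asyncio.Semaphore(4)  # assumed per-worker concurrency cap

async def run_job(job_id: str, strategy) -> None:
    # Each job gets its own scratch directory so parallel strategies
    # never clobber each other's artifacts.
    async with _slots:
        workdir = Path(tempfile.mkdtemp(prefix=f"job_{job_id}_"))
        try:
            await strategy(workdir)
        finally:
            shutil.rmtree(workdir, ignore_errors=True)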
Submission Service
Role: Validates and deduplicates POVs/patches, bundles SARIF, and prepares submissions.
Scale Tactics: Bloom-style fast checks + deep validation on candidates; multi-LLM consensus for near-duplicate detection.
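A sketch of the two-stage check, with a cheap signature filter ahead of the expensive validation; the signature fields and the deep_validate callback are assumptions:

import hashlib

_seen: set[str] = set()

def crash_signature(pov: bytes, frames: list[str]) -> str:
    # Cheap fingerprint: top stack frames plus a short hash of the input.
    return "|".join(frames[:3]) + "|" + hashlib.sha256(pov).hexdigest()[:16]

def is_new_finding(pov: bytes, frames: list[str], deep_validate) -> bool:
    # Fast set lookup rejects exact repeats; only the remaining candidates pay
    # for deep validation (rerun under sanitizers, LLM consensus on near-duplicates).
    sig = crash_signature(pov, frames)
    if sig in _seen:
        return False
    if deep_validate(pov, frames):
        _seen.add(sig)
        return True
    return False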
Scheduling for Throughput
Shard by Fuzzer × Sanitizer
Splitting jobs along the fuzzer-target × sanitizer axes balances CPU-bound compilation against I/O-bound LLM calls.
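Enumerating that matrix is straightforward; the field names below are illustrative:

from dataclasses import dataclass
from itertools import product

@dataclass(frozen=True)
class Job:
    fuzzer_target: str
    sanitizer: str

def shard_jobs(targets: list[str], sanitizers: list[str]) -> list[Job]:
    # One job per (fuzzer target, sanitizer) pair, e.g. 50 targets x 3 sanitizers.
    return [Job(t, s) for t, s in product(targets, sanitizers)]

# e.g. shard_jobs(["png_decode", "zip_parse"], ["address", "memory", "undefined"])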
Idempotent Job Tokens
Jobs can be retried or stolen without double-submission; workers record atomic checkpoints.
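One way to make that concrete, assuming checkpoints live as files on shared storage; os.replace provides the atomic rename:

import json
import os
from pathlib import Path

CHECKPOINT_DIR = Path("/srv/checkpoints")  # assumed shared location

def already_done(job_token: str) -> bool:
    # A token maps to exactly one checkpoint file, so retries and work
    # stealing become harmless no-ops.
    return (CHECKPOINT_DIR / f"{job_token}.json").exists()

def record_checkpoint(job_token: str, result: dict) -> None:
    # Write to a temp file, then rename atomically so readers never
    # observe a half-written checkpoint.
    final = CHECKPOINT_DIR / f"{job_token}.json"
    tmp = final.with_suffix(".tmp")
    tmp.write_text(json.dumps(result))
    os.replace(tmp, final)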
Backpressure & Timeouts
Adaptive concurrency caps and exponential backoff prevent model rate-limit cascades and queue explosions.
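A sketch of both mechanisms, with an AIMD-style cap and jittered exponential backoff; the thresholds are placeholders:

import asyncio
import random

class AdaptiveLimiter:
    # Shrink the concurrency cap on rate limits, creep back up on success.
    def __init__(self, start: int = 8, floor: int = 1, ceiling: int = 32):
        self.cap, self.floor, self.ceiling = start, floor, ceiling
        self.in_flight = 0

    async def acquire(self) -> None:
        while self.in_flight >= self.cap:
            await asyncio.sleep(0.05)
        self.in_flight += 1

    def release(self, rate_limited: bool) -> None:
        self.in_flight -= 1
        if rate_limited:
            self.cap = max(self.floor, self.cap // 2)   # multiplicative decrease
        elif self.cap < self.ceiling:
            self.cap += 1                               # additive increase

async def jittered_backoff(attempt: int, base: float = 1.0, cap: float = 60.0) -> None:
    # Full jitter keeps retries from synchronizing into a thundering herd.
    await asyncio.sleep(min(cap, base * 2 ** attempt) * random.random())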
Multi-Model Orchestration
Routing & Fallback
class LLMRouter:
    """Route a prompt across providers in preference order, falling back on failure."""

    MODELS = ["claude", "gpt", "gemini"]

    async def call(self, prompt, validate):
        # call_model, RateLimit, Overload, and backoff are helpers elided here.
        for name in self.MODELS:
            try:
                out = await call_model(name, prompt)
                if validate(out):          # treat LLM output as untrusted
                    return out
            except (RateLimit, Overload):
                await backoff()            # exponential delay before the next model
                continue
        raise RuntimeError("All models failed")
This simple pattern becomes non-trivial at scale; backoff and per-model quotas avoid failure cascades.
Validation Gates
Workers treat LLM outputs as untrusted: compile, run under sanitizers, and verify POV negation for patches before promotion.
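A gate of that shape might look like the following; the build and PoV-reproduction scripts are placeholders for project-specific commands:

import subprocess

def passes_gate(patch_path: str, pov_path: str) -> bool:
    # Promote a patch only if it applies, builds under a sanitizer, and negates the PoV.
    steps = [
        ["git", "apply", "--check", patch_path],   # patch applies cleanly
        ["git", "apply", patch_path],
        ["./build.sh", "--sanitizer=address"],     # hypothetical project build script
        ["./run_pov.sh", pov_path],                # hypothetical: exits 0 iff no crash
    ]
    for cmd in steps:
        try:
            if subprocess.run(cmd, timeout=600).returncode != 0:
                return False
        except subprocess.TimeoutExpired:
            return False
    return True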
Observability
Per-model success rates, token costs, and latency distributions drive dynamic routing and cost-aware throttling.
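For example, per-model counters of this kind could feed the router (the field names are made up):

from collections import defaultdict
from dataclasses import dataclass, field

@dataclass
class ModelStats:
    calls: int = 0
    successes: int = 0
    tokens: int = 0
    latencies_ms: list[float] = field(default_factory=list)

    @property
    def success_rate(self) -> float:
        return self.successes / self.calls if self.calls else 0.0

stats: dict[str, ModelStats] = defaultdict(ModelStats)

def healthy_models(min_rate: float = 0.8) -> list[str]:
    # Route to models that are currently succeeding, cheapest-in-tokens first.
    good = [m for m, s in stats.items() if s.success_rate >= min_rate]
    return sorted(good, key=lambda m: stats[m].tokens)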
Hard-Learned Scale Lessons
Process Isolation Prevents Races
Unique per-job paths (/tmp/job_{id}/...) eliminated cross-strategy file clobbering and nondeterminism.
Locks Are Not a Silver Bullet
We removed coarse locks in favor of lock-free maps and message passing to avoid deadlocks during peak submission windows.
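As a sketch of the message-passing alternative: one task owns the map, and everyone else sends updates through a queue instead of taking a lock:

import asyncio

async def submission_ledger(updates: asyncio.Queue) -> None:
    # Single owner of the state map; no other task mutates it directly,
    # so there is nothing to lock and nothing to deadlock on.
    ledger: dict[str, str] = {}
    while True:
        pov_id, status = await updates.get()
        ledger[pov_id] = status
        updates.task_done()

async def mark_submitted(updates: asyncio.Queue, pov_id: str) -> None:
    # Producers only enqueue; they never block on shared state.
    await updates.put((pov_id, "submitted"))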
Static Analysis Must Be Cached
Precomputing call graphs and reachability shaved minutes per job and made performance predictable across VMs.
Backoff Beats Fallback Storms
Without exponential backoff, rate-limit bursts on one model stampede the next. Adaptive caps stabilized throughput.
From AIxCC to Real-World Workloads
CI/CD Integration
security_scan:
- static_analysis: precompute
- llm_discovery: parallel_strategies
- patch_generation: consensus
- verification: pov_negation + regression
- deploy: gated
The same decomposition scales to monorepos and nightly scans.
Cost Controls
Token budgets and model tiers per strategy keep API costs manageable under load.
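A sketch of what those budgets could look like in configuration; the tier names and token ceilings are placeholders:

# Hypothetical per-strategy budget table: a model tier plus a token ceiling.
STRATEGY_BUDGETS = {
    "quick_triage":    {"tier": "small",    "max_tokens": 50_000},
    "deep_discovery":  {"tier": "frontier", "max_tokens": 500_000},
    "patch_consensus": {"tier": "frontier", "max_tokens": 200_000},
}

def within_budget(strategy: str, tokens_used: int) -> bool:
    # Checked before every LLM call; over-budget strategies are skipped or downgraded.
    return tokens_used < STRATEGY_BUDGETS[strategy]["max_tokens"]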
Reproducibility
Seeded runs and artifact bundles (inputs, logs, patches) make results auditable for security review.
Explore the Architecture
Our open-source CRS demonstrates this architecture end-to-end, from job scheduling to patch validation.
Validated in competition: thousands of concurrent jobs, robust outputs, predictable costs.