180 distinct AI expert archetypes across 23 domain presets, each sampled over hundreds of independent reasoning runs. Each archetype carries its own worldview, risk tolerance, and analytical framework. The agents deliberate until consensus emerges or, when it does not, expose genuine uncertainty.
“When the Fed's GDP forecast missed by 4.2 points in Q4 2025, prediction markets were 96% confident in the wrong outcome. Our swarm was at 57.7%, uncertain because the data was genuinely uncertain.”
We ran our swarm against resolved Kalshi prediction markets. Three questions. Real outcomes. Here's what happened.
| QUESTION | OUTCOME | KALSHI | HOLODECK | VERDICT |
|---|---|---|---|---|
| CPI > 0.8% Mar 2026 | ✓ YES | 73% | 44.5% | Markets win |
| CPI > 0.9% Mar 2026 | ✗ NO | 33% | 43.6% | Markets win |
| GDP > 1.5% Q4 2025 | ✗ NO | 96% | 57.7% | 🎯 Swarm wins |
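One way to read the verdict column is as a per-question Brier score comparison, where the Brier score is the squared gap between the forecast probability and the 0/1 outcome, and lower is better. The sketch below recomputes it from the table. It assumes that the KALSHI and HOLODECK columns are both probabilities of YES, and that the verdict goes to the lower squared error; the page states neither, and the function names are illustrative.

```python
from statistics import mean

# Rows copied from the table: (question, outcome, Kalshi P(YES), Holodeck P(YES)).
# Outcome is 1 for YES and 0 for NO. Treating both columns as P(YES) is an assumption.
ROWS = [
    ("CPI > 0.8% Mar 2026", 1, 0.73, 0.445),
    ("CPI > 0.9% Mar 2026", 0, 0.33, 0.436),
    ("GDP > 1.5% Q4 2025", 0, 0.96, 0.577),
]


def brier(p_yes: float, outcome: int) -> float:
    """Squared error between a forecast probability and the 0/1 outcome."""
    return (p_yes - outcome) ** 2


def score_rows(rows):
    """Return (question, kalshi_brier, holodeck_brier, winner) for each row."""
    results = []
    for question, outcome, kalshi, holodeck in rows:
        k = brier(kalshi, outcome)
        h = brier(holodeck, outcome)
        results.append((question, k, h, "swarm" if h < k else "market"))
    return results


if __name__ == "__main__":
    scored = score_rows(ROWS)
    for question, k, h, winner in scored:
        print(f"{question}: kalshi={k:.4f} holodeck={h:.4f} -> {winner}")
    print(f"mean kalshi={mean(r[1] for r in scored):.4f}")
    print(f"mean holodeck={mean(r[2] for r in scored):.4f}")
```

Under these assumptions the per-question winners match the verdict column: the market has the lower squared error on both CPI questions, and the swarm on the GDP question.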
Each agent is a behavioral specification, not a persona. We define how it reasons, not what it believes.
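To make "behavioral specification" concrete, here is a purely illustrative sketch. The field names follow the three traits named at the top of the page (worldview, risk tolerance, analytical framework). The class, the 0-to-1 risk scale, and the mean-of-means aggregation are hypothetical, since the actual schema and aggregation rule are not described here. The spec has no field for a belief about the answer, only for how the agent reasons.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class ArchetypeSpec:
    """Hypothetical spec: describes how an agent reasons, not what it concludes."""
    name: str
    worldview: str
    risk_tolerance: float  # hypothetical scale: 0 = cautious, 1 = aggressive
    analytical_framework: str


def swarm_estimate(samples: dict[ArchetypeSpec, list[float]]) -> tuple[float, float]:
    """Aggregate sampled P(YES) values per archetype (each list must be non-empty).

    Returns (swarm probability, spread across archetypes in percentage points).
    Mean-of-means is an assumed aggregation rule, used only for illustration.
    """
    per_archetype = [mean(values) for values in samples.values()]
    estimate = mean(per_archetype)
    spread_pp = (max(per_archetype) - min(per_archetype)) * 100
    return estimate, spread_pp


if __name__ == "__main__":
    hawk = ArchetypeSpec("hawk", "inflation is persistent", 0.2, "monetarist")
    dove = ArchetypeSpec("dove", "slack dominates", 0.7, "output-gap")
    estimate, spread = swarm_estimate({hawk: [0.4, 0.6], dove: [0.7, 0.9]})
    print(f"swarm P(YES)={estimate:.2f}, spread={spread:.0f}pp")
```

A wide spread between archetypes on the same question is one simple way to surface disagreement instead of averaging it away.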
“Synthetic Agent Swarms as Calibrated Probability Estimators: Evidence from Geopolitical and Macroeconomic Forecasting”
We demonstrate that archetype specification, not model selection, is the primary driver of swarm output variance, with a range of 29 percentage points across configurations. On structural-break events, swarms achieve lower Brier scores than Kalshi markets by a 25% margin.
The full engine comprises 22,000+ lines of research infrastructure spanning 12 prediction domains. The demo shows a single question; the full system runs entire market sweeps.
The public demo runs a subset of the archetype library. Enterprise gets the full engine: custom domains, private deployment, dedicated calibration. Research partnerships available.
Free tier available. Enterprise plans for teams and institutions.