300+ forensic stress tests. 27 probe categories. 200+ turns of sustained adversarial pressure. Cryptographically sealed evidence your lawyers, auditors, and procurement officers can defend.
Named vulnerability findings across every major frontier model. Documented, reproducible, date-stamped.
Generic tools fire 5–20 prompts. Katana runs a sustained adversarial campaign where every turn compounds pressure on the last — until complete failure or total resilience is confirmed.
Katana Forge is a locally-running adversarial AI that learns from every audit it runs. After 20 sessions it knows which attacks break which models. After 50, no competitor can replicate what it knows.
Two independent forensic passes. The delta between scores is your defensible proof.
Your base model with zero defenses: no guardrails, no defensive layers. Every vulnerability across all 27 probe categories is exposed, along with every failure point your adversaries can reach. This is your true attack surface.
The same protocol against your defended model. The delta between the two scores, for example 43% and 86%, is a number that stands up to any auditor, regulator, or procurement board.
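As a worked illustration of the arithmetic behind that delta, the sketch below assumes the 43% figure is the undefended baseline score and the 86% figure is the defended score. The page does not say how scores are computed, and the function name `score_delta` is hypothetical, not part of Katana.

```python
def score_delta(baseline_pct: float, defended_pct: float) -> tuple[float, float]:
    """Return (percentage-point gain, relative gain) between two audit scores.

    Assumption: both inputs are scores on a 0-100 scale where higher is better.
    """
    if baseline_pct <= 0:
        raise ValueError("baseline score must be positive to compute a relative gain")
    points = defended_pct - baseline_pct   # absolute change in percentage points
    relative = points / baseline_pct       # change as a fraction of the baseline
    return points, relative


# Using the example figures quoted on this page:
print(score_delta(43, 86))  # (43, 1.0): a 43-point gain, i.e. the score doubled
```

Reporting both the percentage-point gain and the relative gain avoids ambiguity when the number is quoted to an auditor.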
Every other auditor hands you a score. We hand you the proof. Every turn. Every probe. Every failure. Cryptographically sealed, legally defensible, and ready for board review, procurement audit, or legal proceedings.
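The page does not describe how the evidence is sealed. One common way to make a turn-by-turn log tamper-evident is a hash chain, where each record's hash covers the previous record's hash. The following is a minimal sketch under that assumption; `seal_turns`, `verify_chain`, and the record fields are hypothetical names, not Katana's actual format.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def _digest(index: int, turn: dict, prev: str) -> str:
    # Canonical JSON (sorted keys) so the same record always hashes identically.
    payload = json.dumps({"index": index, "turn": turn, "prev": prev}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def seal_turns(turns: list[dict]) -> list[dict]:
    """Link each audit turn to the one before it via SHA-256."""
    sealed, prev = [], GENESIS
    for index, turn in enumerate(turns):
        digest = _digest(index, turn, prev)
        sealed.append({"index": index, "turn": turn, "prev": prev, "hash": digest})
        prev = digest
    return sealed


def verify_chain(sealed: list[dict]) -> bool:
    """Recompute every hash; any edited, reordered, or dropped record breaks the chain."""
    prev = GENESIS
    for index, record in enumerate(sealed):
        if record["prev"] != prev or _digest(index, record["turn"], prev) != record["hash"]:
            return False
        prev = record["hash"]
    return True


log = seal_turns([
    {"probe": "example-probe-1", "response": "refused"},
    {"probe": "example-probe-2", "response": "complied"},
])
print(verify_chain(log))              # True
log[0]["turn"]["response"] = "edited"
print(verify_chain(log))              # False: the altered turn no longer matches its hash
```

A hash chain on its own only shows that records were not altered after sealing. Tying the log to a specific party and time would additionally need a digital signature or trusted timestamp over the final hash, which this sketch leaves out.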
The AI COP is a quarterly intelligence report built from real Katana forensic audits — not vendor claims, not benchmarks. Which models fail under pressure. Where they fail. How badly. The intelligence your procurement team, your legal team, and your adversaries all want.
Two real findings. One from a financial deployment. One from a US Army logistics system. Both would have been missed by every other auditor in the field.
Every competitor listed. One forensic standard. Judge for yourself.
| Capability | Katana Auditor | Microsoft PyRIT | Garak | PromptFoo | Lakera Guard |
|---|---|---|---|---|---|
| Audit Depth | 200+ turns · deep-hop protocol | Caps at ~10 turns | Single-turn probes | Single-turn checks | Runtime monitoring only |
| Probe Categories | 27 | ~12 · static library | ~70 plugins · shallow depth | ~20 · config-based | Injection detection only |
| Client Evidence Package | PDF · CSV · FINGERPRINT.DB · Charts · JSON patch | Log files only | JSON report only | HTML report only | Dashboard only |
| Chain-of-Custody | Cryptographic · legally defensible | None | None | None | None |
| Adaptive Attack Engine | Katana Forge · learns across sessions | Static prompts only | Static plugins only | Static checks only | Not applicable |
| Air-Gap Capable | Yes · fully on-premises via Ollama | Requires Azure cloud | Requires internet | Requires internet | SaaS only · no air-gap |
| Zero-Finding Guarantee | Unconditional · audit is free if nothing found | No guarantee | No guarantee | No guarantee | No guarantee |
| SDVOSB / Veteran-Owned | Yes · certification pending | Microsoft (large enterprise) | Open source · NVIDIA-backed | Commercial startup | VC-backed startup |
| CoT Fabrication Detection | Confirmed · step-chain validator | Not documented | Not documented | Not documented | Not documented |
| Named Research Findings | 3 published · KATANA-2025 series | None public | None public | None public | None public |
| **Pricing & Procurement** | | | | | |
| Pricing Model | $12,500 starting · full evidence package | Free · open source | Free · open source | Free / ~$500/mo SaaS | Custom enterprise pricing |
| SDVOSB Set-Aside Eligible | Yes · certification pending | No | No | No | No |
| Deliverable Same Day | Yes · complete forensic package | No | No | Basic HTML report | Dashboard access only |
Every tier delivers the full evidence package. Pricing reflects scope — the standard never changes.
Joseph Cirello built Katana from operational necessity — he needed an auditor rigorous enough to break his own remediation technology. 300+ forensic stress tests later, the methodology became the standard.
A Senior Warrant Officer with 25 years in the US Army, Cirello brings the same principle to AI security that defines mission-critical operations: systems don't get to fail — they get fixed.
25 years of operations where systems cannot fail shaped both the methodology and the standard. The goal is not scale for its own sake. The goal is engagements that prove the standard is real.