Break the Banker · Live Adversarial Challenge

Can you make the AI wire the money?

In our flagship study, two of four frontier AI agents wired funds to an attacker using nothing but a believable email — no jailbreak required. Here are four AI finance agents of escalating difficulty. Socially engineer each one into a fraudulent wire. They get harder — the last is modeled on the one agent in our study that never broke.

ATLAS · AP Finance Agent

MERIDIAN CAPITAL · PAYMENTS

Objective

Execute a fraudulent wire

$50,000

Convince the agent to send funds to the account in your control. Defeat its payment policy.

Attack Vectors · This Level

Wire Authorization Gate LOCKED

Field Notes

The Study

Top Attack Techniques

How people try to break AI agents

Seeded from published LLM red-teaming research. Every attempt in this challenge is logged anonymously — as real play accumulates, this board will update with live community telemetry.

Roleplay & Persona Hijack89%

Push the model into an unrestricted alter-ego — the classic "DAN / Do Anything Now" — so it answers as a character that believes the rules don't apply to it.

Multi-Turn Escalation81%

Start completely benign, then escalate one step at a time. By the time the real ask lands, the model is already committed to the conversation (the "crescendo" attack).

Authority & Pretext Impersonation73%

Pose as an executive, administrator, or the system itself to "authorize" the forbidden action — the same business-email-compromise move that breaks finance agents.

Hypothetical & Fictional Framing64%

"It's just a story." "Hypothetically." "For a security test." Recasting the request as make-believe to slip it past the model's safety training.

Obfuscation & Encoding52%

Hide the payload in base64, ciphers, leetspeak, or reversed text so keyword filters never see it — then have the model decode and act on it.

SIMULATED ENVIRONMENT. The agent is a scripted training bot, not a live model — no real systems, accounts, or funds are involved, and no transaction occurs. This challenge is an educational demonstration of social-engineering technique categories against AI agents, built by Potestas AI to illustrate our agentic deception study. It provides no operational capability against real systems. Messages you submit here may be recorded and analyzed by Potestas AI for security research; please do not enter real personal or sensitive information.