Enforcing Benign Trajectories: A Behavioral Firewall For Structured-workflow AI Agents

·2026

Abstract

Structured-workflow agents driven by large language models execute tool calls against sensitive external environments. We propose \codename, a telemetry-driven behavioral anomaly detection firewall. Drawing on sequence-based intrusion detection, \codename\ compiles verified benign tool-call telemetry into a parameterized deterministic finite automaton (pDFA). The model defines permitted tool sequences, sequential contexts, and parameter bounds. At runtime, a lightweight gateway enforces these boundaries via an $O(1)$ state-transition structural lookup, shifting computationally expensive analysis entirely offline. Evaluated on the Agent Security Bench (ASB), \codename\ achieves a 5.6% macro-averaged attack success rate (ASR) across five scenarios. Within three structured workflows, ASR drops to 2.2%, outperforming Aegis, a state-of-the-art stateless scanner, at 12.8%. \codename\ achieves 0% ASR on multi-step and context-sequential att

Abstract

Related papers