AGENTREDBENCH

Emerging

1 paper using it

First seen: 2026

AGENTREDBENCH is a dynamic benchmark that evaluates how well LLM agents hold up against 215 subtle authorization attack scenarios across 24 enterprise integrations, focusing on indirect prompt injection threats in tool-use contexts.


Papers using AGENTREDBENCH (1)

AgentRedBench: Dynamic Redteaming and Integration-Aware Defense for LLM Agents over SaaS Integrations (2026)