AgenticInterpBench
Emerging | 1 paper using it
First seen: 2026
AgenticInterpBench is a circuit-explanation benchmark of 84 semi-synthetic transformer circuits with 163 component-level annotations. It is used to evaluate how well language model agents explain identified circuit components.