AgenticInterpBench

Emerging
1 paper using it
First seen: 2026

AgenticInterpBench is a circuit-explanation benchmark consisting of 84 semi-synthetic transformer circuits with 163 component-level annotations. It is used to evaluate how well language model agents explain the components of an identified circuit.
