Cybench
Emerging
3 papers using it
First seen: 2025
Cybench is a benchmark that evaluates how well language model agents find vulnerabilities, using a set of challenges designed to assess their capabilities on software security tasks.