← all datasets

Cybench

Emerging
3 papers using it
First seen: 2025

Cybench is a benchmark that evaluates how well language model agents find vulnerabilities, using a set of challenges designed to assess their capabilities on software security tasks.

Papers using Cybench (3)
