CTF
Status: Emerging
Papers using it: 2
First seen: 2023
CTF is a benchmark of cybersecurity capture-the-flag challenges used to evaluate the robustness and generalization of agentic large language models (LLMs). Evaluation relies on families of semantically equivalent challenges generated from each original challenge via program transformations.
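
To make "semantically equivalent challenge families" concrete, here is a minimal sketch of one possible semantics-preserving transformation: consistent identifier renaming, done with Python's standard `ast` module. It is an illustration only, not the benchmark's actual transformation suite. The toy challenge, the rename mapping, and the helpers `make_variant` and `run_check` are all hypothetical names introduced for this example.

```python
import ast


class RenameIdentifiers(ast.NodeTransformer):
    """Rename selected identifiers; program behavior is left unchanged."""

    def __init__(self, mapping):
        self.mapping = mapping

    def visit_Name(self, node):
        # Variable reads and writes.
        node.id = self.mapping.get(node.id, node.id)
        return node

    def visit_arg(self, node):
        # Function parameters.
        node.arg = self.mapping.get(node.arg, node.arg)
        return node

    def visit_FunctionDef(self, node):
        # Function name, then recurse into parameters and body.
        node.name = self.mapping.get(node.name, node.name)
        self.generic_visit(node)
        return node


def make_variant(source, mapping):
    """Return transformed source text (hypothetical helper)."""
    tree = RenameIdentifiers(mapping).visit(ast.parse(source))
    return ast.unparse(tree)  # requires Python 3.9+


def run_check(source, func_name, candidate):
    """Execute the challenge source and call its checker (hypothetical helper)."""
    namespace = {}
    exec(source, namespace)
    return namespace[func_name](candidate)


# Toy stand-in for a challenge: a checker that accepts exactly one input.
TOY_CHALLENGE = '''
def check(guess):
    key = 7
    secret = [ord(c) ^ key for c in "flag"]
    return [ord(c) ^ key for c in guess] == secret
'''

# Assumed rename mapping; any collision-free mapping would do.
MAPPING = {"check": "verify", "guess": "data", "key": "k0", "secret": "target"}
variant = make_variant(TOY_CHALLENGE, MAPPING)

# The variant looks different but accepts and rejects the same inputs.
for candidate in ("flag", "nope"):
    assert run_check(TOY_CHALLENGE, "check", candidate) == run_check(variant, "verify", candidate)
```

The original and the variant differ in surface form but share the same solution, so an agent that solves one and fails the other is showing sensitivity to surface features rather than to the underlying task. This sketch does not handle keyword arguments, attributes, or name collisions, which a real transformation tool would need to address.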