CTF-like adversarial example defenses
Emerging
1 paper using it
First seen: 2025
CTF-like adversarial example defenses are a set of benchmark tasks that evaluate how effectively large language models can break defenses against adversarial examples. The defenses in these tasks typically resemble simplified or educational exercises, in the style of capture-the-flag (CTF) challenges, rather than real-world scenarios.