
Wildjailbreak

Emerging
2 papers using it
6,836 HF downloads
133 HF likes
First seen: 2025

WildJailbreak Dataset Card: WildJailbreak is an open-source synthetic safety-training dataset with 262K prompt-response pairs, covering both vanilla prompts (direct harmful requests) and adversarial prompts (complex adversarial jailbreaks). To mitigate exaggerated safety behaviors, WildJailbreak provides two contrastive types of queries: 1)

Papers using Wildjailbreak (2)
