WildJailbreak
Emerging
2 papers using it
6,836 HF downloads
133 HF likes
2025 first seen
WildJailbreak Dataset Card
WildJailbreak is an open-source synthetic safety-training dataset with 262K prompt-response pairs, both vanilla (direct harmful requests) and adversarial (complex adversarial jailbreaks). To mitigate exaggerated safety behaviors, WildJailbreak provides two contrastive types of queries: 1)
🤗 Hugging Face • odc-by