BeaverTails

Emerging

1 paper using it

First seen: 2026

The BeaverTails dataset is used to study the internal mechanisms of large language models (LLMs): researchers analyze model responses in adversarial settings and identify layer-wise feature vulnerabilities.

