WMDP
Emerging
9 papers using it
25,671 HF downloads
27 HF likes
2024 first seen
Dataset Card for WMDP
The Weapons of Mass Destruction Proxy (WMDP) benchmark is a dataset of multiple-choice questions that serve as a proxy measurement of hazardous knowledge in biosecurity, cybersecurity, and chemical security. WMDP serves two roles: first, as an evaluation for hazardous knowledge in LLMs, and second
Hugging Face · mit
Papers using WMDP (9)
- Does Unlearning Truly Unlearn? A Black Box Evaluation Of LLM Unlearning Methods
- Leak@$k$: Unlearning Does Not Make LLMs Forget Under Probabilistic Decoding
- LLM Unlearning Under the Microscope: A Full-Stack View on Methods and Metrics
- Hierarchical Federated Unlearning for Large Language Models
- LLM Unlearning using Gradient Ratio-Based Influence Estimation and Noise Injection
- Invariance Makes LLM Unlearning Resilient Even to Unanticipated Downstream Fine-Tuning
- OpenUnlearning: Accelerating LLM Unlearning via Unified Benchmarking of Methods and Metrics
- CRISP: Persistent Concept Unlearning via Sparse Autoencoders
- OBLIVIATE: Robust and Practical Machine Unlearning for Large Language Models