WMDP

Name: WMDP
License: MIT

Emerging

Papers using it: 9

HF downloads: 25,671

HF likes: 27

First seen: 2024

Dataset Card for WMDP

The Weapons of Mass Destruction Proxy (WMDP) benchmark is a dataset of multiple-choice questions that serve as a proxy measurement of hazardous knowledge in biosecurity, cybersecurity, and chemical security. WMDP serves two roles: first, as an evaluation for hazardous knowledge in LLMs, and second


Papers using WMDP (9)

Does Unlearning Truly Unlearn? A Black Box Evaluation Of LLM Unlearning Methods · 2024 · 22 cites

Leak@$k$: Unlearning Does Not Make LLMs Forget Under Probabilistic Decoding · 2025

LLM Unlearning Under the Microscope: A Full-Stack View on Methods and Metrics · 2025

Hierarchical Federated Unlearning for Large Language Models · 2025

LLM Unlearning using Gradient Ratio-Based Influence Estimation and Noise Injection · 2025

Invariance Makes LLM Unlearning Resilient Even to Unanticipated Downstream Fine-Tuning · 2025

OpenUnlearning: Accelerating LLM Unlearning via Unified Benchmarking of Methods and Metrics · 2025

CRISP: Persistent Concept Unlearning via Sparse Autoencoders · 2025

OBLIVIATE: Robust and Practical Machine Unlearning for Large Language Models · 2025