Dawn Song

14 papers · 8928 citations

Most-cited papers

Measuring Massive Multitask Language Understanding
2020 · 7766 citations
Representation Engineering: A Top-down Approach To AI Transparency
2023 · 899 citations
RigorLLM: Resilient Guardrails For Large Language Models Against Undesired Content
2024 · 76 citations
GuardAgent: Safeguard LLM Agents By A Guard Agent Via Knowledge-enabled Reasoning
2024 · 73 citations
Decoding Compressed Trust: Scrutinizing The Trustworthiness Of Efficient LLMs Under Compression
2024 · 54 citations
VERINA: Benchmarking Verifiable Code Generation
2025
OpenSage: Self-programming Agent Generation Engine
2026
CUBE: A Standard For Unifying Agent Benchmarks
2026
DevOps-Gym: Benchmarking AI Agents In Software DevOps Cycle
2026

Topics

Safety & Alignment Evaluation Benchmarks Model Architecture Code Agents In-Context Learning Survey Paper Efficiency Training Techniques Agentic