Chao Du
13 papers · 500 citations
Most-cited papers
- Weak-to-strong Jailbreaking On Large Language Models · 2024 · 109 citations
- Improved Techniques For Optimization-based Jailbreaking On Large Language Models · 2024 · 101 citations
- Improved Few-shot Jailbreaking Can Circumvent Aligned Language Models And Their Defenses · 2024 · 79 citations
- Taskweaver: A Code-first Agent Framework · 2023 · 69 citations
- Bootstrapping Language Models With DPO Implicit Rewards · 2024 · 53 citations
Topics