Michael Backes
14 papers · 1161 citations
Most-cited papers
- "Do Anything Now": Characterizing and Evaluating In-The-Wild Jailbreak Prompts on Large Language Models · 2023 · 581 citations
- Composite Backdoor Attacks Against Large Language Models · 2023 · 95 citations
- Instruction Backdoor Attacks Against Customized LLMs · 2024 · 81 citations
Topics