Ruoxi Jia
12 papers · 1560 citations
Most-cited papers
- Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To! · 2023 · 1086 citations
- Algorithm Of Thoughts: Enhancing Exploration Of Ideas In Large Language Models · 2023 · 108 citations
- RigorLLM: Resilient Guardrails For Large Language Models Against Undesired Content · 2024 · 76 citations
Topics