Awesome Similarity Search

Ruoxi Jia

12 papers · 1560 citations
Most-cited papers
  • Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To!
    2023 · 1086 citations
  • SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal
    2024 · 168 citations
  • Algorithm of Thoughts: Enhancing Exploration of Ideas in Large Language Models
    2023 · 108 citations
  • RigorLLM: Resilient Guardrails for Large Language Models Against Undesired Content
    2024 · 76 citations
  • Practical Membership Inference Attacks Against Large-scale Multi-modal Models: A Pilot Study
    2023 · 49 citations
Topics
Safety & Alignment · Model Architecture · Evaluation · Fine-Tuning · RAG · Training Techniques · Vision-Language · Efficiency · In-Context Learning

