Jing Shao
21 papers · 743 citations
Most-cited papers
- SALAD-Bench: A Hierarchical And Comprehensive Safety Benchmark For Large Language Models · 2024 · 216 citations
- Attacks, Defenses And Evaluations For LLM Conversation Safety: A Survey · 2024 · 147 citations
- PsySafe: A Comprehensive Framework For Psychological-based Attack, Defense, And Evaluation Of Multi-agent System Safety · 2024 · 82 citations
- CodeAttack: Revealing Safety Generalization Challenges Of Large Language Models Via Code Completion · 2024 · 68 citations
- Explainable And Interpretable Multimodal Large Language Models: A Comprehensive Survey · 2024 · 67 citations
- SafeWork-R1: Coevolving Safety And Intelligence Under The AI-45° Law · 2025
- Toolsafe: Enhancing Tool Invocation Safety Of LLM-based Agents Via Proactive Step-level Guardrail And Feedback · 2026
- Benchmarks For Trajectory Safety Evaluation And Diagnosis In OpenClaw And Codex: Atbench-claw And Atbench-codex · 2026
Topics