Wenxuan Wang
11 papers · 0 citations
Most-cited papers
- Emotionally Numb Or Empathetic? Evaluating How LLMs Feel Using EmotionBench · 2023 · 78 citations
- On The Resilience Of LLM-based Multi-agent Collaboration With Faulty Agents · 2024 · 59 citations
- Not All Countries Celebrate Thanksgiving: On The Cultural Dominance In Large Language Models · 2023 · 57 citations
- All Languages Matter: On The Multilingual Safety Of Large Language Models · 2023 · 46 citations
- Who Is ChatGPT? Benchmarking LLMs' Psychological Portrayal Using PsychoBench · 2023 · 42 citations
- A Survey On The Safety And Security Threats Of Computer-using Agents: JARVIS Or Ultron? · 2026 · 1 citation
- ChartM\(^3\): Benchmarking Chart Editing With Multimodal Instructions · 2025
- Beyond The Leaderboard: Rethinking Medical Benchmarks For Large Language Models · 2026
- Mmedexpert-r1: Strengthening Multimodal Medical Reasoning Via Domain-specific Adaptation And Clinical Guideline Reinforcement · 2026
- Inference-time Scaling Of Verification: Self-evolving Deep Research Agents Via Test-time Rubric-guided Verification · 2026
- Toward Personalized LLM-powered Agents: Foundations, Evaluation, And Future Directions · 2026
- Emu3.5: Native Multimodal Models Are World Learners · 2025
- ComboBench: Can LLMs Manipulate Physical Devices To Play Virtual Reality Games? · 2025
Topics