Wenxuan Wang
11 papers · 0 citations
Most-cited papers
- Emotionally Numb Or Empathetic? Evaluating How LLMs Feel Using EmotionBench · 2023 · 78 citations
- On The Resilience Of LLM-based Multi-agent Collaboration With Faulty Agents · 2024 · 59 citations
- Not All Countries Celebrate Thanksgiving: On The Cultural Dominance In Large Language Models · 2023 · 57 citations
- All Languages Matter: On The Multilingual Safety Of Large Language Models · 2023 · 46 citations
- Who Is ChatGPT? Benchmarking LLMs' Psychological Portrayal Using PsychoBench · 2023 · 42 citations
- A Survey On The Safety And Security Threats Of Computer-using Agents: JARVIS Or Ultron? · 2026 · 1 citation
- Emu3.5: Native Multimodal Models Are World Learners · 2025
- ChartM\(^3\): Benchmarking Chart Editing With Multimodal Instructions · 2025
- Beyond The Leaderboard: Rethinking Medical Benchmarks For Large Language Models · 2026
- MMedExpert-R1: Strengthening Multimodal Medical Reasoning Via Domain-specific Adaptation And Clinical Guideline Reinforcement · 2026
- ComboBench: Can LLMs Manipulate Physical Devices To Play Virtual Reality Games? · 2025
- Inference-time Scaling Of Verification: Self-evolving Deep Research Agents Via Test-time Rubric-guided Verification · 2026
- Identifying The Achilles' Heel: An Iterative Method For Dynamically Uncovering Factual Errors In Large Language Models · 2026
- Toward Personalized LLM-powered Agents: Foundations, Evaluation, And Future Directions · 2026
Topics