
Wenxuan Wang

11 papers · 0 citations

Most-cited papers

Emotionally Numb Or Empathetic? Evaluating How LLMs Feel Using EmotionBench
2023 · 78 citations
On The Resilience Of LLM-based Multi-agent Collaboration With Faulty Agents
2024 · 59 citations
Not All Countries Celebrate Thanksgiving: On The Cultural Dominance In Large Language Models
2023 · 57 citations
All Languages Matter: On The Multilingual Safety Of Large Language Models
2023 · 46 citations
Who Is ChatGPT? Benchmarking LLMs' Psychological Portrayal Using PsychoBench
2023 · 42 citations
A Survey On The Safety And Security Threats Of Computer-using Agents: JARVIS Or Ultron?
2026 · 1 citation
ChartM\(^3\): Benchmarking Chart Editing With Multimodal Instructions
2025
Beyond The Leaderboard: Rethinking Medical Benchmarks For Large Language Models
2026
MMedExpert-R1: Strengthening Multimodal Medical Reasoning Via Domain-specific Adaptation And Clinical Guideline Reinforcement
2026
Inference-time Scaling Of Verification: Self-evolving Deep Research Agents Via Test-time Rubric-guided Verification
2026
Toward Personalized LLM-powered Agents: Foundations, Evaluation, And Future Directions
2026
Emu3.5: Native Multimodal Models Are World Learners
2025
ComboBench: Can LLMs Manipulate Physical Devices To Play Virtual Reality Games?
2025

Topics

Evaluation · Safety & Alignment · Benchmarks · Code Agents · Survey Paper · Visual QA & Reasoning · Vision-Language Models · Browser Agents · Agentic Training Techniques