Zirui Wang
11 papers · 0 citations
Most-cited papers
- Beyond The Imitation Game: Quantifying And Extrapolating The Capabilities Of Language Models · 2022 · 2373 citations
- Ferret: Refer And Ground Anything Anywhere At Any Granularity · 2023 · 503 citations
- MM1: Methods, Analysis & Insights From Multimodal LLM Pre-training · 2024 · 261 citations
- MM1.5: Methods, Analysis & Insights From Multimodal LLM Fine-tuning · 2024 · 70 citations
- TokenCompose: Text-to-image Diffusion With Token-level Supervision · 2023 · 39 citations
- MMAU: A Holistic Benchmark Of Agent Capabilities Across Diverse Domains · 2024
- SMAC-Hard: Enabling Mixed Opponent Strategy Script And Self-play On SMAC · 2024
- VEAttack: Downstream-agnostic Vision Encoder Attack Against Large Vision Language Models · 2025
- Cue3D: Quantifying The Role Of Image Cues In Single-image 3D Generation · 2025
- MANZANO: A Simple And Scalable Unified Multimodal Model With A Hybrid Vision Tokenizer · 2025
- OpenVision 2: A Family Of Generative Pretrained Visual Encoders For Multimodal Learning · 2025
- MCPMark: A Benchmark For Stress-testing Realistic And Comprehensive MCP Use · 2025
Topics