Yuhang Zang
13 papers · 0 citations
Most-cited papers
- Are We On The Right Way For Evaluating Large Vision-language Models? · 2024 · 736 citations
- InternLM2 Technical Report · 2024 · 378 citations
- InternLM-XComposer2: Mastering Free-form Text-image Composition And Comprehension In Vision-language Large Model · 2024 · 372 citations
- InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-contextual Input And Output · 2024 · 192 citations
- Scene Text Detection With Supervised Pyramid Context Network · 2018 · 161 citations
- Streaming Long Video Understanding With Large Language Models · 2024 · 158 citations
- FASA: Feature Augmentation And Sampling Adaptation For Long-tailed Instance Segmentation · 2021 · 110 citations
- RAR: Retrieving And Ranking Augmented MLLMs For Visual Recognition · 2024 · 2 citations
- MMDU: A Multi-turn Multi-image Dialog Understanding Benchmark And Instruction-tuning Dataset For LVLMs · 2024 · 1 citation
- CapRL: Stimulating Dense Image Caption Capabilities Via Reinforcement Learning · 2025
- ScaleCap: Inference-time Scalable Image Captioning Via Dual-modality Debiasing · 2025
- ARM-Thinker: Reinforcing Multimodal Generative Reward Models With Agentic Tool Use And Visual Reasoning · 2025
- Emembench: Interactive Benchmarking Of Episodic Memory For VLM Agents · 2026
- LSVOS 2025 Challenge Report: Recent Advances In Complex Video Object Segmentation · 2025
Topics