Yixuan Li
10 papers · 7 citations
Most-cited papers
- Is A Picture Worth A Thousand Words? Delving Into Spatial Reasoning For Vision Language Models · 2024 · 138 citations
- Picle: Eliciting Diverse Behaviors From Large Language Models With Persona In-context Learning · 2024 · 31 citations
- Autodroid-v2: Boosting Slm-based GUI Agents Via Code Generation · 2024 · 26 citations
- Intercontrol: Zero-shot Human Interaction Generation By Controlling Every Joint · 2023 · 21 citations
- Vquala 2025 Challenge On Visual Quality Comparison For Large Multimodal Models: Methods And Results · 2025 · 7 citations
- Holocine: Holistic Generation Of Cinematic Multi-shot Long Video Narratives · 2025
- Qdepth-vla: Quantized Depth Prediction As Auxiliary Supervision For Vision-language-action Models · 2025
- LSVOS 2025 Challenge Report: Recent Advances In Complex Video Object Segmentation · 2025
- Magicquillv2: Precise And Interactive Image Editing With Layered Visual Cues · 2025
Topics