Yangfan He
10 papers · 3 citations
Most-cited papers
- GLIMPSE: Do Large Vision-language Models Truly Think With Videos Or Just Glimpse At Them? · 2025 · 2 citations
- TV-RAG: A Temporal-aware And Semantic Entropy-weighted Framework For Long Video Retrieval And Understanding · 2025 · 1 citation
- Dmllm-tts: Self-verified And Efficient Test-time Scaling For Diffusion Multi-modal Large Language Models · 2025
- DTP: A Simple Yet Effective Distracting Token Pruning Framework For Vision-language Action Models · 2026
- Physicsmind: Sim And Real Mechanics Benchmarking For Physical Reasoning And Prediction In Foundational Vlms And World Models · 2026
Topics