Awesome Papers

Yangfan He

10 papers · 3 citations
Most-cited papers
  • GLIMPSE: Do Large Vision-language Models Truly Think With Videos Or Just Glimpse At Them?
    2025 · 2 citations
  • TV-RAG: A Temporal-aware And Semantic Entropy-weighted Framework For Long Video Retrieval And Understanding
    2025 · 1 citation
  • dMLLM-TTS: Self-verified And Efficient Test-time Scaling For Diffusion Multi-modal Large Language Models
    2025
  • DTP: A Simple Yet Effective Distracting Token Pruning Framework For Vision-language Action Models
    2026
Topics
Vision-Language Models · Video-Language · Benchmarks · Visual QA & Reasoning · Image-Text Retrieval · Audio-Visual · Embodied & Agents


© 2026 Awesome Papers.