Awesome Similarity Search

Yangfan He

10 papers · 3 citations
Most-cited papers
  • GLIMPSE: Do Large Vision-language Models Truly Think With Videos Or Just Glimpse At Them?
    2025 · 2 citations
  • TV-RAG: A Temporal-aware And Semantic Entropy-weighted Framework For Long Video Retrieval And Understanding
    2025 · 1 citation
  • dMLLM-TTS: Self-verified And Efficient Test-time Scaling For Diffusion Multi-modal Large Language Models
    2025
  • DTP: A Simple Yet Effective Distracting Token Pruning Framework For Vision-language Action Models
    2026
  • Physicsmind: Sim And Real Mechanics Benchmarking For Physical Reasoning And Prediction In Foundational VLMs And World Models
    2026
Topics
  • Vision-Language Models
  • Video-Language
  • Benchmarks
  • Visual QA & Reasoning
  • Image-Text Retrieval
  • Audio-Visual
  • Embodied & Agents


© 2026 Awesome Papers.