Awesome Similarity Search

Zhengyuan Yang

11 papers · 2 citations
Most-cited papers
  • Scaling Up Vision-language Pre-training For Image Captioning
    2021 · 157 citations
  • TAP: Text-aware Pre-training For Text-VQA And Text-Caption
    2020 · 96 citations
  • SAT: 2D Semantics Assisted Training For 3D Visual Grounding
    2021 · 90 citations
  • Diagnostic Benchmark And Iterative Inpainting For Layout-guided Image Generation
    2023 · 6 citations
  • GLIMPSE: Do Large Vision-language Models Truly Think With Videos Or Just Glimpse At Them?
    2025 · 2 citations
  • Exploring A Unified Vision-centric Contrastive Alternative On Multi-modal Web Documents
    2025
  • EdiVal-Agent: An Object-centric Framework For Automated, Fine-grained Evaluation Of Multi-turn Editing
    2025
  • Glance: Accelerating Diffusion Models With 1 Sample
    2025
  • Point-RFT: Improving Multimodal Reasoning With Visually Grounded Reinforcement Finetuning
    2025
Topics
Vision-Language Models · Visual Language · Image Generation · Visual QA & Reasoning · 3D Vision · Image Restoration · Video-Language · Benchmarks · Image-Text Retrieval · Object Detection


© 2026 Awesome Papers.