Awesome Speech Audio

Zhongyuan Wang

14 papers · 0 citations
Most-cited papers
  • Multi-scale Progressive Fusion Network For Single Image Deraining
    2020 · 679 citations
  • Hit: Hierarchical Transformer With Momentum Contrast For Video-text Retrieval
    2021 · 134 citations
  • Degrade Is Upgrade: Learning Degradation For Low-light Image Enhancement
    2021 · 44 citations
  • Towards Practical Capture Of High-fidelity Relightable Avatars
    2023 · 24 citations
  • Rap: Redundancy-aware Video-language Pre-training For Text-video Retrieval
    2022 · 4 citations
  • Omnigen2: Towards Instruction-aligned Multimodal Generation
    2025
  • Omnigen2: Towards Instruction-aligned Multimodal Generation
    2025
  • Emu3.5: Native Multimodal Models Are World Learners
    2025
  • Ac-dit: Adaptive Coordination Diffusion Transformer For Mobile Manipulation
    2025
  • Mathsticks: A Benchmark For Visual Symbolic Compositional Reasoning With Matchstick Puzzles
    2025
  • End-to-end Training Of Multimodal Model And Ranking Model
    2024
  • Tiger: Tool-integrated Geometric Reasoning In Vision-language Models For Robotics
    2025
  • Sapave: Towards Active Perception And Manipulation In Vision-language-action Models For Robotics
    2026
  • Tokenflow: Rethinking Fine-grained Cross-modal Alignment In Vision-language Retrieval
    2022
Topics
  • Vision-Language Models
  • Image Generation
  • Image Restoration
  • Image Retrieval
  • 3D Vision
  • Benchmarks
  • Visual Language
  • Uncategorized
  • Video-Language
  • Embodied & Agents
