Awesome Speech Audio

Bohan Zeng

10 papers · 0 citations
Most-cited papers
  • Native Visual Understanding: Resolving Resolution Dilemmas In Vision-language Models
    2025
  • Rethinking Driving World Model As Synthetic Data Generator For Perception Tasks
    2025
  • Scone: Bridging Composition And Distinction In Subject-driven Image Generation Via Unified Understanding-generation Modeling
    2025
  • Diadem: Advancing Dialogue Descriptions In Audiovisual Video Captioning For Multimodal Large Language Models
    2026
Topics
  • Vision-Language Models
  • Benchmarks
  • Video-Language
  • Audio-Visual
  • Uncategorized
  • Visual QA & Reasoning


Β© 2026 Awesome Papers.