Awesome Speech Audio

Yi Xin

11 papers · 0 citations
Most-cited papers
  • dMLLM-TTS: Self-verified And Efficient Test-time Scaling For Diffusion Multi-modal Large Language Models
    2025
  • UniMedVL: Unifying Medical Multimodal Understanding And Generation Through Observation-knowledge-analysis
    2025
  • From Masks To Worlds: A Hitchhiker's Guide To World Models
    2025
  • UniPercept: Towards Unified Perceptual-level Image Understanding Across Aesthetics, Quality, Structure, And Texture
    2025
  • Mano Technical Report
    2025
Topics
  • Vision-Language Models
  • Benchmarks
  • Uncategorized
  • Image-Text Retrieval
  • Visual QA & Reasoning
  • Embodied & Agents

© 2026 Awesome Papers.