Awesome Speech Audio

Yuxuan Wang

18 papers · 0 citations
Most-cited papers
  • video-SALMONN: Speech-enhanced Audio-visual Large Language Models
    2024 · 90 citations
  • HawkEye: Training Video-text LLMs For Grounding Text In Videos
    2024 · 80 citations
  • Halo: Estimation And Reduction Of Hallucinations In Open-source Weak Large Language Models
    2023 · 46 citations
  • LLaMA Rider: Spurring Large Language Models To Explore The Open World
    2023 · 26 citations
  • Efficient Temporal Extrapolation Of Multimodal Large Language Models With Temporal Grounding Bridge
    2024 · 16 citations
  • Qwen3-VL Technical Report
    2025
  • Hunyuan3D Studio: End-to-end AI Pipeline For Game-ready 3D Asset Generation
    2025
  • OmniVideoBench: Towards Audio-visual Understanding Evaluation For Omni MLLMs
    2025
  • Sounding That Object: Interactive Object-aware Image To Audio Generation
    2025
  • PhysCodeBench: Benchmarking Physics-aware Symbolic Simulation Of 3D Scenes Via Self-corrective Multi-agent Refinement
    2026
Topics
  • Vision-Language
  • Model Architecture
  • Benchmarks
  • Training Techniques
  • Evaluation
  • Safety & Alignment
  • Visual QA & Reasoning
  • Video-Language
  • Audio-Visual
  • Fine-Tuning


© 2026 Awesome Papers.