Awesome Speech Audio


Xinyu Wang

19 papers · 6 citations
Most-cited papers
  • Can Large Vision-language Models Understand Multimodal Sarcasm?
    2025 · 4 citations
  • Compute Only 16 Tokens In One Timestep: Accelerating Diffusion Transformers With Cluster-driven Feature Caching
    2025 · 1 citation
  • P2MFDS: A Privacy-preserving Multimodal Fall Detection System For Elderly People In Bathroom Environments
    2025 · 1 citation
  • Controllable Video Generation: A Survey
    2025
  • Qwen3 Technical Report
    2025
  • HybridTM: Combining Transformer And Mamba For 3D Semantic Segmentation
    2025
  • Proxywar: Dynamic Assessment Of LLM Code Generation In Game Arenas
    2026
Topics
  • Uncategorized
  • Video-Language
  • Vision-Language Models
  • Code Agents
  • Evaluation
  • Benchmarks
  • Multi-Agent


© 2026 Awesome Papers.