Awesome Speech Audio

Xing Sun

20 papers · 1647 citations
Most-cited papers
  • Video-MME: The First-ever Comprehensive Evaluation Benchmark Of Multi-modal LLMs In Video Analysis
    2024 · 1125 citations
  • MAC-SQL: A Multi-agent Collaborative Framework For Text-to-SQL
    2023 · 192 citations
  • MemoChat: Tuning LLMs To Use Memos For Consistent Long-range Open-domain Conversation
    2023 · 72 citations
  • Cantor: Inspiring Multimodal Chain-of-thought Of MLLM
    2024 · 58 citations
  • Eliminating Biased Length Reliance Of Direct Preference Optimization Via Down-sampled KL Divergence
    2024 · 29 citations
  • Ask&Confirm: Active Detail Enriching For Cross-modal Retrieval With Partial Query
    2021 · 15 citations
  • Coarse-to-fine: Learning Compact Discriminative Representation For Single-stage Image Retrieval
    2023 · 6 citations
  • SmartSnap: Proactive Evidence Seeking For Self-verifying Agents
    2025
  • Process-level Trajectory Evaluation For Environment Configuration In Software Engineering Agents
    2025
  • APTBench: Benchmarking Agentic Potential Of Base LLMs During Pre-training
    2025
  • Youtu-agent: Scaling Agent Productivity With Automated Generation And Hybrid Policy Optimization
    2025
  • FlexiReID: Adaptive Mixture Of Expert For Multi-modal Person Re-identification
    2025
  • Devil's In The Details: Aligning Visual Clues For Conditional Embedding In Person Re-identification
    2020
Topics
Model Architecture · Evaluation · Vision-Language · In-Context Learning · Prompting · Training Techniques · Safety & Alignment · Image Retrieval · Cross-Modal Hashing · Code Agents
