Awesome Speech Audio

Kaipeng Zhang

17 papers · 0 citations
Most-cited papers
  • OmniQuant: Omnidirectionally Calibrated Quantization For Large Language Models
    2023 · 385 citations
  • OneLLM: One Framework To Align All Modalities With Language
    2023 · 231 citations
  • ImageBind-LLM: Multi-modality Instruction Tuning
    2023 · 174 citations
  • SPHINX-X: Scaling Data And Parameters For A Family Of Multi-modal Large Language Models
    2024 · 149 citations
  • Lumina-Next: Making Lumina-T2X Stronger And Faster With Next-DiT
    2024 · 123 citations
  • OneLLM: One Framework To Align All Modalities With Language
    2023 · 79 citations
  • DiffAgent: Fast And Accurate Text-to-image API Selection With Large Language Model
    2024 · 5 citations
  • Yume: An Interactive World Generation Model
    2025
  • SridBench: Benchmark Of Scientific Research Illustration Drawing Of Image Generation Model
    2025
  • Symbolic Graphics Programming With Large Language Models
    2025
  • InternSpatial: A Comprehensive Dataset For Spatial Reasoning In Vision-language Models
    2025
  • TIR-Bench: A Comprehensive Benchmark For Agentic Thinking-with-images Reasoning
    2025
  • SAMRefiner: Taming Segment Anything Model For Universal Mask Refinement
    2025
  • Focal Guidance: Unlocking Controllability From Semantic-weak Layers In Video Diffusion Models
    2026
Topics
Training Techniques · Vision-Language Models · Model Architecture · Efficiency · Vision-Language · Visual Language · Benchmarks · Code · Video-Language · Fine-Tuning
