Wenqi Shao
13 papers · 5 citations
Most-cited papers
- OmniQuant: Omnidirectionally Calibrated Quantization For Large Language Models (2023) · 385 citations
- SPHINX: The Joint Mixing Of Weights, Tasks, And Visual Embeddings For Multi-modal Large Language Models (2023) · 288 citations
- ImageBind-LLM: Multi-modality Instruction Tuning (2023) · 174 citations
- SPHINX-X: Scaling Data And Parameters For A Family Of Multi-modal Large Language Models (2024) · 149 citations
- Lumina-T2X: Transforming Text Into Any Modality, Resolution, And Duration Via Flow-based Large Diffusion Transformers (2024) · 137 citations
- DiffAgent: Fast And Accurate Text-to-image API Selection With Large Language Model (2024) · 5 citations
- Flow-Anything: Learning Real-world Optical Flow Estimation From Large-scale Single-view Images (2025) · 5 citations
- OWMM-Agent: Open World Mobile Manipulation With Multi-modal Agentic Data Synthesis (2025)
- UniPruning: Unifying Local Metric And Global Feedback For Scalable Sparse LLMs (2025)
- InternSpatial: A Comprehensive Dataset For Spatial Reasoning In Vision-language Models (2025)
- COSMO-RL: Towards Trustworthy LMRMs Via Joint Safety And Stability (2025)
- VTPerception-R1: Enhancing Multimodal Reasoning Via Explicit Visual And Textual Perceptual Grounding (2025)
- UniFork: Exploring Modality Alignment For Unified Multimodal Understanding And Generation (2025)
- SAMRefiner: Taming Segment Anything Model For Universal Mask Refinement (2025)
Topics