Lewei Lu
10 papers · 0 citations
Most-cited papers
- How Far Are We To GPT-4V? Closing The Gap To Commercial Multimodal Models With Open-source Suites · 2024 · 339 citations
- FuseFormer: Fusing Fine-grained Information In Transformers For Video Inpainting · 2021 · 143 citations
- VisionLLM v2: An End-to-end Generalist Multimodal Large Language Model For Hundreds Of Vision-language Tasks · 2024 · 5 citations
- SynerGen-VL: Towards Synergistic Image Understanding And Generation With Vision Experts And Token Folding · 2024 · 4 citations
- PVC: Progressive Visual Token Compression For Unified Image And Video Processing In Large Vision-language Models · 2024 · 2 citations
- From Pixels To Words -- Towards Native Vision-language Primitives At Scale · 2025
- Streamline Without Sacrifice -- Squeeze Out Computation Redundancy In LMM · 2025
- Spatial Preference Rewarding For MLLMs Spatial Understanding · 2025
- Scaling Spatial Intelligence With Multimodal Foundation Models · 2025
- SenseNova-MARS: Empowering Multimodal Agentic Reasoning And Search Via Reinforcement Learning · 2025
Topics