Zhe Chen
12 papers · 3 citations
Most-cited papers
- Gemini 1.5: Unlocking Multimodal Understanding Across Millions Of Tokens Of Context · 2024 · 3466 citations
- InternVL: Scaling Up Vision Foundation Models And Aligning For Generic Visual-linguistic Tasks · 2023 · 2715 citations
- How Far Are We To GPT-4V? Closing The Gap To Commercial Multimodal Models With Open-source Suites · 2024 · 1136 citations
- How Far Are We To GPT-4V? Closing The Gap To Commercial Multimodal Models With Open-source Suites · 2024 · 339 citations
- Pushing Large Language Models To The 6G Edge: Vision, Challenges, And Opportunities · 2023 · 179 citations
- Contrastive Boundary Learning For Point Cloud Segmentation · 2022 · 177 citations
- VisionLLM v2: An End-to-end Generalist Multimodal Large Language Model For Hundreds Of Vision-language Tasks · 2024 · 149 citations
- Online Guidance Graph Optimization For Lifelong Multi-agent Path Finding · 2024 · 7 citations
- VisionLLM v2: An End-to-end Generalist Multimodal Large Language Model For Hundreds Of Vision-language Tasks · 2024 · 5 citations
- Block Shuffle: A Method For High-resolution Fast Style Transfer With Limited Memory · 2020 · 4 citations
- WHU-STree: A Multi-modal Benchmark Dataset For Street Tree Inventory · 2025 · 3 citations
- PVC: Progressive Visual Token Compression For Unified Image And Video Processing In Large Vision-language Models · 2024 · 2 citations
- MMBench-GUI: Hierarchical Multi-platform Evaluation Framework For GUI Agents · 2025
- Advancing MAPF Toward The Real World: A Scalable Multi-agent Realistic Testbed (SMART) · 2025
- RAD: Towards Trustworthy Retrieval-augmented Multi-modal Clinical Diagnosis · 2025
Topics