Conghui He
18 papers · 0 citations
Most-cited papers
- How Far Are We To GPT-4V? Closing The Gap To Commercial Multimodal Models With Open-source Suites · 2024 · 1136 citations
- InternLM2 Technical Report · 2024 · 378 citations
- InternLM-XComposer2: Mastering Free-form Text-image Composition And Comprehension In Vision-language Large Model · 2024 · 372 citations
- How Far Are We To GPT-4V? Closing The Gap To Commercial Multimodal Models With Open-source Suites · 2024 · 339 citations
- ShareGPT4V: Improving Large Multi-modal Models With Better Captions · 2023 · 237 citations
- InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-contextual Input And Output · 2024 · 192 citations
- SPHINX-X: Scaling Data And Parameters For A Family Of Multi-modal Large Language Models · 2024 · 149 citations
- VHM: Versatile And Honest Vision Language Model For Remote Sensing Image Analysis · 2024 · 26 citations
- Cross-view Image Geo-localization With Panorama-BEV Co-retrieval Network · 2024 · 20 citations
- UrBench: A Comprehensive Benchmark For Evaluating Large Multimodal Models In Multi-view Urban Scenarios · 2024 · 10 citations
- Realgen: Photorealistic Text-to-image Generation Via Detector-guided Rewards · 2025
- Earth-agent: Unlocking The Full Landscape Of Earth Observation With Agents · 2025
- Native Visual Understanding: Resolving Resolution Dilemmas In Vision-language Models · 2025
- Prune2drive: A Plug-and-play Framework For Accelerating Vision-language Models In Autonomous Driving · 2025
- Chartverse: Scaling Chart Reasoning Via Reliable Programmatic Synthesis From Scratch · 2026
Topics