Bin Wang
25 papers · 1 citation
Most-cited papers
- How Far Are We To GPT-4V? Closing The Gap To Commercial Multimodal Models With Open-source Suites · 2024 · 1136 citations
- InternLM2 Technical Report · 2024 · 378 citations
- InternLM-XComposer2: Mastering Free-form Text-image Composition And Comprehension In Vision-language Large Model · 2024 · 372 citations
- How Far Are We To GPT-4V? Closing The Gap To Commercial Multimodal Models With Open-source Suites · 2024 · 339 citations
- Graph Structured Network For Image-text Matching · 2020 · 243 citations
- InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-contextual Input And Output · 2024 · 192 citations
- ToolACE: Winning The Points Of LLM Function Calling · 2024 · 155 citations
- BCOT: A Markerless High-precision 3D Object Tracking Benchmark · 2022 · 15 citations
- Mobile-Bench: An Evaluation Benchmark For LLM-based Mobile Agents · 2024 · 12 citations
- CN-RMA: Combined Network With Ray Marching Aggregation For 3D Indoors Object Detection From Multi-view Images · 2024 · 6 citations
- Blind Image Super-resolution With Rich Texture-aware Codebooks · 2023 · 6 citations
- Scaling-up Perceptual Video Quality Assessment · 2025 · 1 citation
- Beyond Single Images: Retrieval Self-augmented Unsupervised Camouflaged Object Detection · 2025
- Relayformer: A Unified Local-global Attention Framework For Scalable Image And Video Manipulation Localization · 2025
- Docr-inspector: Fine-grained And Automated Evaluation Of Document Parsing With VLM · 2025
Topics