Yu Qiao
44 papers · 1 citation
Most-cited papers
- InternVL: Scaling Up Vision Foundation Models And Aligning For Generic Visual-linguistic Tasks · 2023 · 2715 citations
- How Far Are We To GPT-4V? Closing The Gap To Commercial Multimodal Models With Open-source Suites · 2024 · 1136 citations
- Activating More Pixels In Image Super-resolution Transformer · 2022 · 916 citations
- Are We On The Right Way For Evaluating Large Vision-language Models? · 2024 · 736 citations
- Detecting Text In Natural Image With Connectionist Text Proposal Network · 2016 · 674 citations
- UniFormer: Unifying Convolution And Self-attention For Visual Recognition · 2022 · 438 citations
- Point Transformer V3: Simpler, Faster, Stronger · 2023 · 438 citations
- OmniQuant: Omnidirectionally Calibrated Quantization For Large Language Models · 2023 · 385 citations
- InternLM2 Technical Report · 2024 · 378 citations
- How Far Are We To GPT-4V? Closing The Gap To Commercial Multimodal Models With Open-source Suites · 2024 · 339 citations
- OS-Genesis: Automating GUI Agent Trajectory Construction Via Reverse Task Synthesis · 2024 · 4 citations
- An Empirical Study Of Federated Prompt Learning For Vision Language Model · 2025 · 1 citation
- ScaleCUA: Scaling Open-source Computer Use Agents With Cross-platform Data · 2025
- Yume: An Interactive World Generation Model · 2025
- ZeroGUI: Automating Online GUI Learning At Zero Human Cost · 2025
Topics