Ming Yan
22 papers · 1174 citations
Most-cited papers
- X-CLIP: End-to-end Multi-grained Contrastive Learning For Video-text Retrieval · 2022 · 260 citations
- mPLUG-DocOwl 1.5: Unified Structure Learning For OCR-free Document Understanding · 2024 · 237 citations
- AMBER: An LLM-free Multi-dimensional Benchmark For MLLMs Hallucination Evaluation · 2023 · 227 citations
- Hallucination Augmented Contrastive Learning For Multimodal Large Language Model · 2023 · 144 citations
- UReader: Universal OCR-free Visually-situated Language Understanding With Multimodal Large Language Model · 2023 · 143 citations
- mPLUG: Effective And Efficient Vision-language Learning By Cross-modal Skip-connections · 2022 · 141 citations
- Small LLMs Are Weak Tool Learners: A Multi-LLM Agent · 2024 · 22 citations
- Correspondence-free Domain Alignment For Unsupervised Cross-domain Image Retrieval · 2023 · 16 citations
- From Association To Generation: Text-only Captioning By Unsupervised Cross-modal Mapping · 2023 · 9 citations
- ModelScope-Agent: Building Your Customizable Agent System With Open-source Large Language Models · 2023 · 8 citations
- Mobile-Agent-v3.5: Multi-platform Fundamental GUI Agents · 2026
- Zero-shot 3D Map Generation With LLM Agents: A Dual-agent Architecture For Procedural Content Generation · 2025
- Unifying Latent And Lexicon Representations For Effective Video-text Retrieval · 2024
- CorpusQA: A 10 Million Token Benchmark For Corpus-level Analysis And Reasoning · 2026
Topics