Peng Gao
11 papers · 4 citations
Most-cited papers
- UniFormer: Unifying Convolution And Self-attention For Visual Recognition · 2022 · 438 citations
- OmniQuant: Omnidirectionally Calibrated Quantization For Large Language Models · 2023 · 385 citations
- Fast Convergence Of DETR With Spatially Modulated Co-attention · 2021 · 309 citations
- Tip-Adapter: Training-free Adaption Of CLIP For Few-shot Classification · 2022 · 306 citations
- SPHINX: The Joint Mixing Of Weights, Tasks, And Visual Embeddings For Multi-modal Large Language Models · 2023 · 288 citations
- OneLLM: One Framework To Align All Modalities With Language · 2023 · 231 citations
- Point-Bind & Point-LLM: Aligning Point Cloud With Multi-modality For 3D Understanding, Generation, And Instruction Following · 2023 · 213 citations
- ImageBind-LLM: Multi-modality Instruction Tuning · 2023 · 174 citations
- PointCLIP V2: Prompting CLIP And GPT For Powerful 3D Open-world Learning · 2022 · 158 citations
- Frozen CLIP Models Are Efficient Video Learners · 2022 · 156 citations
- Towards Adaptive Meta-gradient Adversarial Examples For Visual Tracking · 2025 · 4 citations
- Adaptive Markup Language Generation For Contextually-grounded Visual Document Understanding · 2025
- Spatial Preference Rewarding For MLLMs Spatial Understanding · 2025
- Z-Image: An Efficient Image Generation Foundation Model With Single-stream Diffusion Transformer · 2025
- How Do Optical Flow And Textual Prompts Collaborate To Assist In Audio-visual Semantic Segmentation? · 2026
Topics