Hao Fei
14 papers · 5 citations
Most-cited papers
- Next-gpt: Any-to-any Multimodal LLM · 2023 · 780 citations
- LL3DA: Visual Interactive Instruction Tuning For Omni-3d Understanding, Reasoning, And Planning · 2023 · 216 citations
- Faithful Logical Reasoning Via Symbolic Chain-of-thought · 2024 · 156 citations
- Omg-llava: Bridging Image-level, Object-level, Pixel-level Reasoning And Understanding · 2024 · 150 citations
- Layoutllm-t2i: Eliciting Layout Guidance From LLM For Text-to-image Generation · 2023 · 141 citations
- Vitcot: Video-text Interleaved Chain-of-thought For Boosting Video Understanding In Large Language Models · 2025 · 4 citations
- Leaf-mamba: Local Emphatic And Adaptive Fusion State Space Model For RGB-D Salient Object Detection · 2025 · 1 citation
- MCM-DPO: Multifaceted Cross-modal Direct Preference Optimization For Alt-text Generation · 2025
- Visual Thoughts: A Unified Perspective Of Understanding Multimodal Chain-of-thought · 2025
- Samtok: Representing Any Mask With Two Words · 2026
Topics