Jiajun Wu
28 papers · 1 citation
Most-cited papers
- ULIP: Learning A Unified Representation Of Language, Images, And Point Clouds For 3D Understanding · 2022 · 216 citations
- ULIP-2: Towards Scalable Multimodal Pre-training For 3D Understanding · 2023 · 88 citations
- SkyScript: A Large And Semantically Diverse Vision-language Dataset For Remote Sensing · 2023 · 78 citations
- ZeroNVS: Zero-shot 360-degree View Synthesis From A Single Image · 2023 · 75 citations
- Diffusion Self-distillation For Zero-shot Customized Image Generation · 2024 · 11 citations
- Embodied Agent Interface: Benchmarking LLMs For Embodied Decision Making · 2024 · 6 citations
- E-MAPP: Efficient Multi-agent Reinforcement Learning With Parallel Program Guidance · 2022 · 1 citation
- 10 Open Challenges Steering The Future Of Vision-language-action Models · 2025 · 1 citation
- LLMC+: Benchmarking Vision-language Model Compression With A Plug-and-play Toolkit · 2025
- Autoregressive Flow Matching For Motion Prediction · 2025
- Coupled Diffusion Sampling For Training-free Multi-view Image Editing · 2025
- Cap-x: A Framework For Benchmarking And Improving Coding Agents For Robot Manipulation · 2026
- World Model For Robot Learning: A Comprehensive Survey · 2026
- Close The Loop: Synthesizing Infinite Tool-use Data Via Multi-agent Role-playing · 2025
- Using Large Language Models For Embodied Planning Introduces Systematic Safety Risks · 2026
Topics