Jiajun Wu

28 papers · 1 citation

Most-cited papers

ULIP: Learning A Unified Representation Of Language, Images, And Point Clouds For 3D Understanding
2022 · 216 citations
ULIP-2: Towards Scalable Multimodal Pre-training For 3D Understanding
2023 · 88 citations
SkyScript: A Large And Semantically Diverse Vision-language Dataset For Remote Sensing
2023 · 78 citations
ZeroNVS: Zero-shot 360-degree View Synthesis From A Single Image
2023 · 75 citations
Diffusion Self-distillation For Zero-shot Customized Image Generation
2024 · 11 citations
Embodied Agent Interface: Benchmarking LLMs For Embodied Decision Making
2024 · 6 citations
E-MAPP: Efficient Multi-agent Reinforcement Learning With Parallel Program Guidance
2022 · 1 citation
10 Open Challenges Steering The Future Of Vision-language-action Models
2025 · 1 citation
LLMC+: Benchmarking Vision-language Model Compression With A Plug-and-play Toolkit
2025
Autoregressive Flow Matching For Motion Prediction
2025
Coupled Diffusion Sampling For Training-free Multi-view Image Editing
2025
Cap-x: A Framework For Benchmarking And Improving Coding Agents For Robot Manipulation
2026
World Model For Robot Learning: A Comprehensive Survey
2026
Close The Loop: Synthesizing Infinite Tool-use Data Via Multi-agent Role-playing
2025
Using Large Language Models For Embodied Planning Introduces Systematic Safety Risks
2026

Topics

Visual Language · 3D Vision · Uncategorized · Multi-Agent · Image Generation · Vision-Language Models · Video-Language · Benchmarks · Planning · Embodied & Agents