Siyuan Huang
23 papers · 2 citations
Most-cited papers
- An Embodied Generalist Agent In 3D World · 2023 · 357 citations
- SPHINX: The Joint Mixing Of Weights, Tasks, And Visual Embeddings For Multi-modal Large Language Models · 2023 · 288 citations
- 3D-VisTA: Pre-trained Transformer For 3D Vision And Text Alignment · 2023 · 245 citations
- Spatio-temporal Self-supervised Representation Learning For 3D Point Clouds · 2021 · 177 citations
- SPHINX-X: Scaling Data And Parameters For A Family Of Multi-modal Large Language Models · 2024 · 149 citations
- 3D-VisTA: Pre-trained Transformer For 3D Vision And Text Alignment · 2023 · 112 citations
- Draw-and-Understand: Leveraging Visual Prompts To Enable MLLMs To Comprehend What You Want · 2024 · 100 citations
- GAPartNet: Cross-category Domain-generalizable Object Perception And Manipulation Via Generalizable And Actionable Parts · 2022 · 81 citations
- SceneVerse: Scaling 3D Vision-language Learning For Grounded Scene Understanding · 2024 · 53 citations
- Unifying 3D Vision-language Understanding Via Promptable Queries · 2024 · 26 citations
- Advancing 3D Scene Understanding With MV-ScanQA Multi-view Reasoning Evaluation And TripAlign Pre-training Dataset · 2025 · 1 citation
- Trace3D: Consistent Segmentation Lifting Via Gaussian Instance Tracing · 2025 · 1 citation
- Mind The Gap: Bridging Occlusion In Gait Recognition Via Residual Gap Correction · 2025
- GaussianFluent: Gaussian Simulation For Dynamic Scenes With Mixed Materials · 2026
- Persistent Visual Memory: Sustaining Perception For Deep Generation In LVLMs · 2026
Topics