UniGraph2: Learning A Unified Embedding Space To Bind Multimodal Graphs
2025 · Yufei He, Yuan Sui, Xiaoxin He, et al.
Abstract
Existing foundation models, such as CLIP, aim to learn a unified embedding space for multimodal data, enabling a wide range of downstream web-based applications like search, recommendation, and content classification. However, these models often overlook the inherent graph structures in multimodal datasets, where entities and their relationships are crucial. Multimodal graphs (MMGs) represent such graphs where each node is associated with features from different modalities, while the edges capture the relationships between these entities. On the other hand, existing graph foundation models primarily focus on text-attributed graphs (TAGs) and are not designed to handle the complexities of MMGs. To address these limitations, we propose UniGraph2, a novel cross-domain graph foundation model that enables general representation learning on MMGs, providing a unified embedding space. UniGraph2 employs modality-specific encoders alongside a graph neural network (GNN) to learn a unified low-dim
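The idea in the abstract of pairing modality-specific encoders with a GNN can be illustrated with a toy sketch. This is not the UniGraph2 implementation: the random linear projections standing in for encoders, the averaging fusion, the single mean-aggregation layer, and all names and dimensions below are assumptions chosen only to show how per-modality node features might end up in one shared space that also reflects graph structure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy multimodal graph (assumed): 4 nodes, each with a text feature (dim 6)
# and an image feature (dim 8), connected in a path 0-1-2-3.
text_feats = rng.normal(size=(4, 6))
image_feats = rng.normal(size=(4, 8))
edges = [(0, 1), (1, 2), (2, 3)]  # undirected

EMBED_DIM = 5  # size of the shared embedding space (assumed)

# Stand-ins for modality-specific encoders: one random linear map per modality.
W_text = rng.normal(size=(6, EMBED_DIM))
W_image = rng.normal(size=(8, EMBED_DIM))


def encode_nodes(text, image):
    """Project each modality into the shared space and average the results."""
    return 0.5 * (text @ W_text + image @ W_image)


def mean_aggregate(h, edge_list):
    """One GNN-style step: average each node with its neighbours (self-loop included)."""
    n = h.shape[0]
    adj = np.eye(n)
    for u, v in edge_list:
        adj[u, v] = adj[v, u] = 1.0
    adj /= adj.sum(axis=1, keepdims=True)  # row-normalise
    return adj @ h


h0 = encode_nodes(text_feats, image_feats)
embeddings = mean_aggregate(h0, edges)
print(embeddings.shape)
```

Each row of `embeddings` is a node vector in the shared space that mixes the node's own text and image features with those of its neighbours. A real model would replace the random projections with pretrained encoders and train the whole pipeline.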
Related papers
- Urbangraphembeddings: Learning And Evaluating Spatially Grounded Multimodal Embeddings For Urban Science (2026)
- Deep Unified Multimodal Embeddings For Understanding Both Content And Users In Social Media Networks (2019)
- Breaking The Modality Barrier: Universal Embedding Learning With Multimodal LLMs (2025)
- UniME-V2: MLLM-as-a-judge For Universal Multimodal Embedding Learning (2025)
- MG\(^2\)-RAG: Multi-granularity Graph For Multimodal Retrieval-augmented Generation (2026)
- Multimodal Prediction Based On Graph Representations (2019)
- Magic-mm-embedding: Towards Visual-token-efficient Universal Multimodal Embedding With MLLMs (2026)
- UniMoCo: Unified Modality Completion For Robust Multi-modal Embeddings (2025)