📊 Datasets — Awesome Graph Learning

230 datasets & benchmarks — 22 canonical foundations plus emerging datasets mined from recent papers. Each links to the papers that use it.

CiteSeer [Canonical]

The CiteSeer dataset consists of 3312 scientific publications classified into one of six classes. The citation network consists of 4732 links. Each publication in the dataset is described by a 0/1-valued word vector indicating the absence/presence of the corresponding word from the dictionary. The dictionary consists of 3703 unique words. The README file in the dataset provides more details.

📄 29 papers
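The 0/1-valued word vector in the CiteSeer description can be illustrated with a minimal sketch. The five-word dictionary and the sample text below are hypothetical stand-ins (the real dictionary has 3703 words), and the whitespace tokenization is an assumption, not the preprocessing used to build the dataset.

```python
def binary_word_vector(text, dictionary):
    """Return a 0/1 list: 1 if the dictionary word occurs in the text, else 0."""
    tokens = set(text.lower().split())
    return [1 if word in tokens else 0 for word in dictionary]


# Hypothetical toy dictionary; the real CiteSeer dictionary has 3703 unique words.
dictionary = ["graph", "neural", "database", "agent", "learning"]

# Hypothetical publication text, for illustration only.
paper = "Learning on graph data with neural models"

features = binary_word_vector(paper, dictionary)
print(features)  # [1, 1, 0, 0, 1]
```

Each publication becomes one such vector, so under this encoding the feature matrix for the full dataset would have 3312 rows and 3703 columns.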

Cora [Canonical]

A citation-network benchmark for node classification (papers linked by citations, classified by topic).

📄 29 papers

PubMed [Canonical]

A citation-network benchmark of biomedical papers for node classification.

📄 25 papers

Open Graph Benchmark [Canonical]

A collection of large-scale, realistic graph datasets with a standardized evaluation for graph machine learning.

📄 11 papers

QM9 [Canonical]

134k small organic molecules with computed quantum-chemical properties, for molecular-property prediction.

📄 8 papers

Open Graph Benchmark (OGB) [Emerging]

The Open Graph Benchmark (OGB) is a collection of large-scale graph datasets, covering node-, link-, and graph-level prediction tasks, used to evaluate the performance of graph neural networks (GNNs).

📄 7 papers

Reddit [Canonical]

The 'Reddit' dataset is a real-world temporal graph used to evaluate the performance of conformal prediction methods for graph neural networks in capturing uncertainties in dynamic settings.

📄 6 papers

ogbn-papers-100M [Emerging]

The ogbn-papers-100M dataset (named ogbn-papers100M in OGB) is a large-scale citation graph of academic papers, used to evaluate the performance of graph neural networks (GNNs) in representation learning.

📄 5 papers

WN-18RR [Emerging]

WN-18RR is a benchmark dataset used to evaluate knowledge graph completion methods, containing a subset of the WordNet lexical database with relationships between entities.

📄 5 papers

ZINC [Canonical]

The ZINC dataset is a "curated collection of commercially available chemical compounds prepared especially for virtual screening" (Wikipedia). It is used for molecular property prediction (predicting the constrained solubility of the molecules), a graph regression task scored by mean absolute error (MAE). See the full description on the dataset page: https://huggingface.co/datasets/graphs-datasets/ZINC.

📄 5 papers
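The MAE score named in the ZINC description can be sketched as follows. The target and prediction values are hypothetical and are not taken from the dataset; the function name is likewise illustrative.

```python
def mean_absolute_error(y_true, y_pred):
    """Average of |target - prediction| over all graphs in the evaluation set."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical constrained-solubility targets and model predictions.
targets = [0.5, -1.0, 2.0, 0.0]
predictions = [1.0, -1.0, 1.0, 0.5]

mae = mean_absolute_error(targets, predictions)
print(mae)  # 0.5
```

Lower is better: an MAE of 0 would mean every predicted value matches its target exactly.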

FB-15K-237 [Emerging]

FB-15K-237 is a benchmark dataset for knowledge graph completion that contains a collection of entities and relationships used to evaluate the performance of various models in inferring missing links within knowledge graphs.

📄 4 papers

PPI [Canonical]

The PPI dataset contains protein-protein interaction data and is used to evaluate semi-supervised node classification algorithms.

📄 4 papers

MovieLens [Emerging]

MovieLens is a dataset used to evaluate recommendation systems, containing user-item interactions that facilitate the assessment of algorithms' performance in predicting user preferences.

📄 3 papers

PCQM4Mv2 [Emerging]

PCQM4Mv2 is a large-scale quantum-chemistry dataset from the OGB Large-Scale Challenge (OGB-LSC), in which the task is to predict the HOMO-LUMO gap of a molecule from its molecular graph (a graph regression task). Papers indexed here also use it to evaluate graph neural networks (GNNs) on the graph alignment problem across varying task difficulties.

📄 3 papers

ACM [Emerging]

Audio-Centric Multimodal Benchmark (ACM). This dataset packages the ACM benchmark introduced by "Efficient and High-Fidelity Omni Modality Retrieval" (arXiv:2603.02098). Dataset repo: chuonghm/ACM. ACM contains four HuggingFace subsets, each using a single test split so that query and candidate tables can keep their natural schemas. composed_audio_retrieval_queries holds flattened AT2A query rows; composed_audio_retrieval_candidates holds the AT2A audio candidate pool… See the full description on the dataset page: https://huggingface.co/datasets/chuonghm/ACM.

📄 2 papers

Actor [Emerging]

The 'Actor' dataset is a benchmark used to evaluate the performance of graph neural networks (GNNs) on heterophilic graphs, where connected nodes have dissimilar features or labels.

📄 2 papers

Bitcoin-alpha [Emerging]

The 'Bitcoin-Alpha' dataset is a benchmark that contains feature-sparse signed networks used to evaluate anomaly detection methods in Bitcoin trust systems.

📄 2 papers

Bitcoin-otc [Emerging]

The Bitcoin OTC dataset is a who-trusts-whom network of users trading on the Bitcoin OTC (over-the-counter) platform, with signed, timestamped trust ratings as edges. It is used to evaluate link prediction methods on temporal signed networks.

📄 2 papers

CIFAR-10 [Emerging]

CIFAR-10 is a dataset containing 60,000 32x32 color images across 10 different classes, commonly used to evaluate the performance of machine learning models in image classification tasks.

📄 2 papers

Coauthor CS [Canonical]

The 'Coauthor CS' dataset contains co-authorship information in the computer science domain and is used to evaluate the effectiveness of unlearning strategies for graph neural networks.

📄 2 papers

Co-author-Physics [Emerging]

The 'Co-author-Physics' dataset is a benchmark used to evaluate the performance of Graph Neural Networks by providing a collection of co-authorship relationships in the field of physics.

📄 2 papers

DBLP [Emerging]

DBLP is a dataset that contains bibliographic information on computer science publications and is used to evaluate link prediction models in graph mining.

📄 2 papers

Freebase [Emerging]

Freebase is a large-scale knowledge graph that contains structured information about entities and their relationships, and it is used to evaluate graph representation learning methods in the context of node classification tasks.

📄 2 papers

MD17 [Emerging]

The MD17 dataset is a benchmark of molecular dynamics data used to evaluate the performance of machine learning interatomic potentials.

📄 2 papers

mini-ImageNet [Emerging]

mini-ImageNet is a 100-class subset of ImageNet, widely used as a benchmark for few-shot image classification.

📄 2 papers

MNIST [Emerging]

The MNIST dataset consists of 70,000 28x28 black-and-white images of handwritten digits extracted from two NIST databases. There are 60,000 images in the training set and 10,000 images in the test set, with one class per digit for a total of 10 classes and about 7,000 images (6,000 train and 1,000 test) per class. Half of the images were drawn by Census Bureau employees and the other half by high school students… See the full description on the dataset page: https://huggingface.co/datasets/ylecun/mnist.

📄 2 papers

MoleculeNet [Emerging]

MoleculeNet is a benchmark specially designed for testing machine learning methods on molecular properties. To facilitate the development of molecular machine learning methods, it curates a number of dataset collections and provides a suite of software that implements many known featurizations and previously proposed algorithms. All methods and datasets are integrated into the open-source DeepChem package (MIT license)… See the full description on the dataset page: https://huggingface.co/datasets/katielink/moleculenet-benchmark.

📄 2 papers

NELL [Emerging]

NELL (Never-Ending Language Learning) is a dataset that contains a continuously growing knowledge base of facts extracted from the web, used to evaluate the robustness and generalization of models in graph-based learning tasks.

📄 2 papers

NTU RGB+D [Emerging]

The NTU-RGB+D dataset is a benchmark that contains skeleton data for human actions and interactions, used to evaluate methods for recognizing both individual actions and human-human interactions.

📄 2 papers

ogbn-arxiv [Canonical]

An OGB citation-network node-classification benchmark over arXiv CS papers.

📄 2 papers

Open Academic Graph [Emerging]

The Open Academic Graph is a large-scale dataset that contains academic papers and their citation relationships, used to evaluate the performance of graph neural networks in modeling graph-structured data.

📄 2 papers

Planetoid [Emerging]

Planetoid refers to the benchmark suite of citation networks (Cora, CiteSeer, and PubMed) with the data splits introduced alongside the Planetoid method. Papers indexed here use it to evaluate link prediction in graph learning, including the assessment of various heuristic learning methods.

📄 2 papers

PROTEINS [Canonical]

The PROTEINS dataset is a medium-sized molecular property prediction dataset. It is used for binary classification (predicting whether molecules are enzymes or not), scored by accuracy under 10-fold cross-validation. The dataset card also describes how to load it in PyGeometric… See the full description on the dataset page: https://huggingface.co/datasets/graphs-datasets/PROTEINS.

📄 2 papers
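The evaluation protocol named in the PROTEINS description (accuracy under 10-fold cross-validation) can be sketched in plain Python. This is only an illustration under stated assumptions: the labels are hypothetical, the fold assignment is a simple seeded shuffle, and a majority-class baseline stands in for a real graph classifier.

```python
import random
from collections import Counter


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and deal them into k disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def majority_class(labels):
    """Baseline 'model': always predict the most frequent training label."""
    return Counter(labels).most_common(1)[0][0]


def cross_validated_accuracy(labels, k=10, seed=0):
    """Mean held-out accuracy over k folds for the majority-class baseline."""
    folds = k_fold_indices(len(labels), k, seed)
    accuracies = []
    for held_out in folds:
        held_out_set = set(held_out)
        train_labels = [y for i, y in enumerate(labels) if i not in held_out_set]
        prediction = majority_class(train_labels)
        correct = sum(1 for i in held_out if labels[i] == prediction)
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / k


# Hypothetical binary labels (1 = enzyme, 0 = non-enzyme), not the real data.
labels = [1] * 60 + [0] * 40

score = cross_validated_accuracy(labels)
print(f"{score:.2f}")  # 0.60
```

In a real evaluation the baseline would be replaced by a model trained on the nine training folds; the fold bookkeeping and the averaging stay the same.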

Slashdot [Emerging]

The Slashdot dataset contains user interactions and relationships within the Slashdot social news website and is used to evaluate link sign prediction in signed graph neural networks.

📄 2 papers

Texas [Emerging]

The 'Texas' dataset is a benchmark used to evaluate the performance of Graph Neural Networks (GNNs) on heterophilic graphs.

📄 2 papers

Wisconsin [Emerging]

The 'Wisconsin' dataset is a benchmark used to evaluate node-level representation learning methods on heterophilic graphs, specifically assessing their ability to capture structural equivalence and hidden relationships among nodes.

📄 2 papers

11 benchmark [Emerging]

The '11 benchmark' is a dataset used to evaluate the performance of Graph Neural Networks (GNNs) on various graph analytical tasks, particularly in distinguishing between homophilic and heterophilic linking patterns.

📄 1 paper

11 benchmark graphs [Emerging]

The '11 benchmark graphs' dataset contains a collection of diverse graph structures used to evaluate the effectiveness of graph contrastive learning methods.

📄 1 paper

12 regular and geometric graph benchmarks [Emerging]

The '12 regular and geometric graph benchmarks' dataset contains a variety of graph structures used to evaluate the performance of interpretable graph neural networks (XGNNs) in tasks such as graph classification.

📄 1 paper

13 benchmarks [Emerging]

The '13 benchmarks' dataset consists of various graph datasets used to evaluate the performance of graph neural networks, particularly in terms of accuracy on both homophilic and heterophilic graphs.

📄 1 paper

14 node classification benchmarks [Emerging]

The '14 node classification benchmarks' are a set of datasets used to evaluate the performance of Graph Neural Networks (GNNs) in classifying nodes within graphs.

📄 1 paper

152 distinct graph datasets [Emerging]

The '152 distinct graph datasets' refers to a collection of diverse graph datasets used to evaluate the scalability and generalizability of the GraphFM model, encompassing various domains such as molecules, citation networks, and product graphs, and totaling 7.4 million nodes and 189 million edges.

📄 1 paper

19 real-world networks [Emerging]

The '19 real-world networks' dataset contains various real-world network structures used to evaluate the performance of algorithms in predicting node importance, particularly through centrality measures like betweenness centrality.

📄 1 paper

1 synthetic [Emerging]

The '1 synthetic' dataset/benchmark is used to evaluate the performance of the proposed Edge Splitting GNN (ES-GNN) framework in distinguishing between task-relevant and irrelevant edges in graph neural networks.

📄 1 paper

26 UEA benchmark datasets [Emerging]

The '26 UEA benchmark datasets' consist of multivariate time series data used to evaluate classification algorithms for multivariate time series classification (MTSC).

📄 1 paper

3D point cloud data [Emerging]

3D point cloud data consists of a set of points in three-dimensional space that represent the shape and structure of objects, and it is used to evaluate graph-based learning methods for tasks such as segmentation and classification.

📄 1 paper

3DPW [Emerging]

The '3DPW' dataset is a benchmark that contains 3D human pose data used to evaluate human motion prediction models.

📄 1 paper

6 standard node classification benchmarks [Emerging]

The "6 standard node classification benchmarks" refers to a set of datasets used to evaluate the performance of Graph Neural Networks (GNNs) in classifying nodes based on their features and neighborhood structures.

📄 1 paper

7 widely-used graph datasets [Emerging]

The '7 widely-used graph datasets' refers to a collection of benchmark datasets utilized to evaluate the performance of Graph Neural Networks (GNNs) and their susceptibility to the over-smoothing problem.

📄 1 paper

8 common real-world networks [Emerging]

The '8 common real-world networks' dataset/benchmark contains a collection of diverse network structures used to evaluate the performance and degree bias of Graph Neural Networks (GNNs) in node classification tasks.

📄 1 paper

9 collected multi-label datasets [Emerging]

The '9 collected multi-label datasets' refers to a set of datasets used to evaluate the performance of various methods in multi-label node classification tasks, addressing the challenges posed by the scarcity of publicly available datasets in this area.

📄 1 paper

9 graph datasets [Emerging]

The '9 graph datasets' is a benchmark containing various graph data used to evaluate the performance of quantization-aware training methods for Graph Neural Networks.

📄 1 paper

ADNI [Emerging]

The ADNI (Alzheimer's Disease Neuroimaging Initiative) dataset contains MRI data and associated demographic and genetic information from subjects, and it is used to evaluate models for Alzheimer's disease classification and causal analysis of disease progression.

📄 1 paper

Alibaba mobile application [Emerging]

The 'Alibaba mobile application' dataset is a real-world heterogeneous graph used to evaluate node classification, link prediction, and online A/B testing methodologies.

📄 1 paper

Alibaba recommendation dataset [Emerging]

The Alibaba recommendation dataset is an industrial-sized dataset used to evaluate the performance of recommendation algorithms, particularly in the context of node classification tasks.

📄 1 paper

AliExpress [Emerging]

The 'AliExpress' dataset is a benchmark that contains structured data used to evaluate link prediction methods in recommendation systems, particularly focusing on the incorporation of edge attributes in large-scale sparse graphs.

📄 1 paper

Alipay [Emerging]

The 'Alipay' dataset is a benchmark used to evaluate the performance of algorithms in graph learning, specifically in the context of graph-structured data related to user interactions and transactions within the Alipay platform.

📄 1 paper

Amazon [Emerging]

The 'Amazon' dataset is a real-world fraud dataset used to evaluate the effectiveness of verification methods for graph neural networks against adversarial attacks.

📄 1 paper

Amazon-2M [Emerging]

The 'Amazon-2M' dataset contains 2 million nodes and 61 million edges, and it is used to evaluate the scalability and efficiency of the Cluster-GCN algorithm for training deep graph convolutional networks.

📄 1 paper

Amazon Computers [Canonical]

The 'Amazon Computers' dataset is a benchmark used to evaluate the performance of graph neural networks in the context of knowledge distillation and collaborative training.

📄 1 paper
