Awesome AI for Science

Genomics

© 2026 Awesome Papers.

Awesome Genomics — curated papers, datasets & benchmarks · Awesome AI for Science

Awesome Genomics

Genomics is one of the most active areas in Awesome AI for Science, with 1,427 papers in this collection, evaluated on datasets such as ProteinGym and The Cancer Genome Atlas (TCGA). A strong starting point is "An AI system to help scientists write expert-level empirical software".

Datasets & benchmarks

ProteinGym: 16 papers

The Cancer Genome Atlas (TCGA): 5 papers

Protein Data Bank (PDB): 4 papers

Protein Data Bank: 4 papers

SARS-CoV-2: 4 papers

CITE-seq: 3 papers

Key papers

60 papers · trending (default) · numbers = 🔥 heat

An AI system to help scientists write expert-level empirical software (2025)
Eser Aygün et al.
12.53
DART-Eval: A Comprehensive DNA Language Model Evaluation Benchmark on Regulatory DNA (2024)
Aman Patel et al.
11.97
InternAgent-1.5: A Unified Agentic Framework for Long-Horizon Autonomous Scientific Discovery (2026)
Shiyang Feng et al.
11.03
BixBench: a Comprehensive Benchmark for LLM-based Agents in Computational Biology (2025)
Ludovico Mitchener et al.
10.60
Accurate RNA 3D structure prediction using a language model-based deep learning approach (2022)
Tao Shen et al.
9.35
Artificial Intelligence and Deep Learning Algorithms for Epigenetic Sequence Analysis: A Review for Epigeneticists and AI Experts (2025)
Muhammad Tahir et al.
8.52
MicroVQA: A Multimodal Reasoning Benchmark for Microscopy-Based Scientific Research (2025)
James Burgess et al.
8.38
Multi-Objective-Guided Discrete Flow Matching for Controllable Biological Sequence Design (2025)
Tong Chen et al.
8.18
Materials Graph Library (MatGL), an open-source graph deep learning library for materials science and chemistry (2025)
Tsz Wai Ko et al.
7.71
Machine Learning Methods for Gene Regulatory Network Inference (2025)
Akshata Hegde et al.
7.64
RiNALMo: General-Purpose RNA Language Models Can Generalize Well on Structure Prediction Tasks (2024)
Rafael Josip Penić et al.
7.62
HEIST: A Graph Foundation Model for Spatial Transcriptomics and Proteomics Data (2025)
Hiren Madhu et al.
7.59
Kosmos: An AI Scientist for Autonomous Discovery (2025)
Ludovico Mitchener et al.
7.41
drGT: Attention-Guided Gene Assessment of Drug Response Utilizing a Drug-Cell-Gene Heterogeneous Network (2024)
Yoshitaka Inoue et al.
7.38
A Text-guided Protein Design Framework (2023)
Shengchao Liu et al.
7.32
Gumbel-Softmax Flow Matching with Straight-Through Guidance for Controllable Biological Sequence Generation (2025)
Sophia Tang et al.
7.13
Contextualizing biological perturbation experiments through language (2025)
Menghua Wu et al.
7.08
BioDiscoveryAgent: An AI Agent for Designing Genetic Perturbation Experiments (2024)
Yusuf Roohani et al.
7.00
Understanding protein function with a multimodal retrieval-augmented foundation model (2025)
Timothy Fei Truong Jr et al.
6.64
PerturBench: Benchmarking Machine Learning Models for Cellular Perturbation Analysis (2024)
Yan Wu et al.
6.50
RNA-FrameFlow: Flow Matching for de novo 3D RNA Backbone Design (2024)
Rishabh Anand et al.
6.39
Interpretable Graph Kolmogorov-Arnold Networks for Multi-Cancer Classification and Biomarker Identification using Multi-Omics Data (2025)
Fadi Alharbi et al.
6.12
Molecular-driven Foundation Model for Oncologic Pathology (2025)
Anurag Vaidya et al.
6.01
Multi-Exit Kolmogorov-Arnold Networks: enhancing accuracy and parsimony (2025)
James Bagrow and Josh Bongard
6.01
DualEquiNet: A Dual-Space Hierarchical Equivariant Network for Large Biomolecules (2025)
Junjie Xu et al.
6.01
scDrugMap: Benchmarking Large Foundation Models for Drug Response Prediction (2025)
Qing Wang et al.
5.96
Democratizing AI scientists using ToolUniverse (2025)
Shanghua Gao et al.
5.87
Whole-Genome Phenotype Prediction with Machine Learning: Open Problems in Bacterial Genomics (2025)
Tamsin James et al.
5.79
AI-driven multi-omics integration for multi-scale predictive modeling of causal genotype-environment-phenotype relationships (2024)
You Wu et al.
5.78
Iterative Distillation for Reward-Guided Fine-Tuning of Diffusion Models in Biomolecular Design (2025)
Xingyu Su et al.
5.76
Multi-modal AI for comprehensive breast cancer prognostication (2024)
Jan Witowski et al.
5.68
GBDTSVM: Combined Support Vector Machine and Gradient Boosting Decision Tree Framework for efficient snoRNA-disease association prediction (2025)
Ummay Maria Muna et al.
5.65
CellVerse: Do Large Language Models Really Understand Cell Biology? (2025)
Fan Zhang et al.
5.65
Benchmarking AI scientists for omics data driven biological discovery (2025)
Erpai Luo et al.
5.65
Universal Biological Sequence Reranking for Improved De Novo Peptide Sequencing (2025)
Zijie Qiu et al.
5.65
LOCO-EPI: Leave-one-chromosome-out (LOCO) as a benchmarking paradigm for deep learning based prediction of enhancer-promoter interactions (2025)
Muhammad Tahir et al.
5.59
Towards Scientific Intelligence: A Survey of LLM-based Scientific Agents (2025)
Shuo Ren et al.
5.54
Protein Large Language Models: A Comprehensive Survey (2025)
Yijia Xiao et al.
5.48
Hyperbolic Genome Embeddings (2025)
Raiyan R. Khan et al.
5.40
MAMMAL -- Molecular Aligned Multi-Modal Architecture and Language (2024)
Yoel Shoshan et al.
5.37
Diffusion on language model encodings for protein sequence generation (2024)
Viacheslav Meshchaninov et al.
5.29
JanusDNA: A Powerful Bi-directional Hybrid DNA Foundation Model (2025)
Qihao Duan et al.
5.29
Transformer-Based Representation Learning for Robust Gene Expression Modeling and Cancer Prognosis (2025)
Shuai Jiang et al.
5.24
On learning functions over biological sequence space: relating Gaussian process priors, regularization, and gauge fixing (2025)
Samantha Petti et al.
5.24
Differentiable Folding for Nearest Neighbor Model Optimization (2025)
Ryan K. Krueger et al.
5.18
In-silico biological discovery with large perturbation models (2025)
Djordje Miladinovic et al.
5.18
HybriDNA: A Hybrid Transformer-Mamba2 Long-Range DNA Language Model (2025)
Mingqian Ma et al.
5.13
Learning to Discover Regulatory Elements for Gene Expression Prediction (2025)
Xingyu Su et al.
5.13
Comparative Performance Evaluation of Large Language Models for Extracting Molecular Interactions and Pathway Knowledge (2023)
Gilchan Park et al.
5.12
LLMs for Bayesian Optimization in Scientific Domains: Are We There Yet? (2025)
Rushil Gupta et al.
5.10
Benchmarking and Evaluation of AI Models in Biology: Outcomes and Recommendations from the CZI Virtual Cells Workshop (2025)
Elizabeth Fahsbender et al.
4.98
Virtual Cells: Predict, Explain, Discover (2025)
Emmanuel Noutahi et al.
4.87
Flow Matching Meets Biology and Life Science: A Survey (2025)
Zihao Li et al.
4.87
PROTOCOL: Late Interaction Retrieval for Protein Homolog Search (2026)
Gabrielle Cohn et al.
4.84
PLM-eXplain: Divide and Conquer the Protein Embedding Space (2025)
Jan van Eck et al.
4.82
Hallucination, reliability, and the role of generative AI in science (2025)
Charles Rathkopf
4.82
gRNAde: Geometric Deep Learning for 3D RNA inverse design (2023)
Chaitanya K. Joshi et al.
4.79
A Phylogenetic Approach to Genomic Language Modeling (2025)
Carlos Albors et al.
4.76
GENEOnet: Statistical analysis supporting explainability and trustworthiness (2025)
Giovanni Bocchi, Patrizio Frosini, Alessandra Micheletti, Alessandro Pedretti, Carmen Gratteri, Filippo Lunghini, Andrea Rosario Beccari, and Carmine Talarico
4.76
BAnG: Bidirectional Anchored Generation for Conditional RNA Design (2025)
Roman Klypa et al.
4.71