Awesome Medical AI
Medical AI is one of the most active areas in Awesome AI for Science, with 2,537 papers in this collection, evaluated on datasets such as ChEMBL, CrossDocked-2020, and PoseBusters. A strong starting point is "An AI system to help scientists write expert-level empirical software".
Key papers
- An AI system to help scientists write expert-level empirical software (2025) | Eser Aygün et al. | 12.64
- BixBench: a Comprehensive Benchmark for LLM-based Agents in Computational Biology (2025) | Ludovico Mitchener et al. | 10.71
- SmileyLlama: Modifying Large Language Models for Directed Chemical Space Exploration (2024) | Joseph M. Cavanagh et al. | 9.86
- Towards an AI co-scientist (2025) | Juraj Gottweis et al. | 9.77
- Accurate RNA 3D structure prediction using a language model-based deep learning approach (2022) | Tao Shen et al. | 9.35
- Graph Neural Networks in Modern AI-aided Drug Discovery (2025) | Odin Zhang et al. | 8.75
- Artificial Intelligence and Deep Learning Algorithms for Epigenetic Sequence Analysis: A Review for Epigeneticists and AI Experts (2025) | Muhammad Tahir et al. | 8.64
- Combining physics-based and data-driven models: advancing the frontiers of research with Scientific Machine Learning (2025) | Alfio Quarteroni, Paola Gervasio, and Francesco Regazzoni | 8.56
- MicroVQA: A Multimodal Reasoning Benchmark for Microscopy-Based Scientific Research (2025) | James Burgess et al. | 8.49
- PharmAgents: Building a Virtual Pharma with Large Language Model Agents (2025) | Bowen Gao et al. | 8.07
- AI-Powered Prediction of Nanoparticle Pharmacokinetics: A Multi-View Learning Approach (2025) | Amirhossein Khakpour et al. | 7.82
- HEIST: A Graph Foundation Model for Spatial Transcriptomics and Proteomics Data (2025) | Hiren Madhu et al. | 7.70
- RiNALMo: General-Purpose RNA Language Models Can Generalize Well on Structure Prediction Tasks (2024) | Rafael Josip Penić et al. | 7.62
- Physics-informed graph neural networks for flow field estimation in carotid arteries (2024) | Julian Suk et al. | 7.55
- SE(3)-Equivariant Ternary Complex Prediction Towards Target Protein Degradation (2025) | Fanglei Xue et al. | 7.50
- Collaborative Expert LLMs Guided Multi-Objective Molecular Optimization (2025) | Jiajun Yu et al. | 7.40
- drGT: Attention-Guided Gene Assessment of Drug Response Utilizing a Drug-Cell-Gene Heterogeneous Network (2024) | Yoshitaka Inoue et al. | 7.38
- Unified modeling of 3D molecular generation via atomic interactions with PocketXMol (2026) | Xingang Peng et al. | 7.24
- Multi-view biomedical foundation models for molecule-target and property prediction (2024) | Parthasarathy Suryanarayanan et al. | 7.23
- Contextualizing biological perturbation experiments through language (2025) | Menghua Wu et al. | 7.19
- Representation Meets Optimization: Training PINNs and PIKANs for Gray-Box Discovery in Systems Pharmacology (2025) | Nazanin Ahmadi Daryakenari et al. | 7.13
- FlowDock: Geometric Flow Matching for Generative Protein-Ligand Docking and Affinity Prediction (2024) | Alex Morehead and Jianlin Cheng | 7.08
- BioDiscoveryAgent: An AI Agent for Designing Genetic Perturbation Experiments (2024) | Yusuf Roohani et al. | 7.00
- Physics-Informed Machine Learning in Biomedical Science and Engineering (2025) | Nazanin Ahmadi et al. | 6.86
- Physics-informed deep learning for infectious disease forecasting (2025) | Ying Qian et al. | 6.78
- BioNeMo Framework: a modular, high-performance library for AI model development in drug discovery (2024) | Peter St. John et al. | 6.67
- SMILES-Mamba: Chemical Mamba Foundation Models for Drug ADMET Prediction (2024) | Bohao Xu et al. | 6.61
- QuantumBind-RBFE: Accurate Relative Binding Free Energy Calculations Using Neural Network Potentials (2025) | Francesc Sabanés Zariquiey et al. | 6.58
- PerturBench: Benchmarking Machine Learning Models for Cellular Perturbation Analysis (2024) | Yan Wu et al. | 6.50
- Multimodal AI predicts clinical outcomes of drug combinations from preclinical data (2025) | Yepeng Huang et al. | 6.47
- POLO: Preference-Guided Multi-Turn Reinforcement Learning for Lead Optimization (2025) | Ziqing Wang et al. | 6.29
- Robust Inference-Time Steering of Protein Diffusion Models via Embedding Optimization (2026) | Minhuan Li et al. | 6.29
- Machine Learning Enhanced Calculation of Quantum-Classical Binding Free Energies (2025) | Moritz Bensberg et al. | 6.23
- Molecular-driven Foundation Model for Oncologic Pathology (2025) | Anurag Vaidya et al. | 6.12
- Multi-Exit Kolmogorov-Arnold Networks: enhancing accuracy and parsimony (2025) | James Bagrow and Josh Bongard | 6.12
- VALID-Mol: a Systematic Framework for Validated LLM-Assisted Molecular Design (2025) | Malikussaid et al. | 6.12
- scDrugMap: Benchmarking Large Foundation Models for Drug Response Prediction (2025) | Qing Wang et al. | 6.07
- Machine Learning Methods for Gene Regulatory Network Inference (2025) | Akshata Hegde et al. | 6.01
- Interpretable Graph Kolmogorov-Arnold Networks for Multi-Cancer Classification and Biomarker Identification using Multi-Omics Data (2025) | Fadi Alharbi et al. | 5.96
- CoTox: Chain-of-Thought-Based Molecular Toxicity Reasoning and Prediction (2025) | Jueon Park et al. | 5.93
- General Binding Affinity Guidance for Diffusion Models in Structure-Based Drug Design (2024) | Yue Jian et al. | 5.91
- Molecular Graph Contrastive Learning with Line Graph (2025) | Xueyuan Chen et al. | 5.84
- AI-driven multi-omics integration for multi-scale predictive modeling of causal genotype-environment-phenotype relationships (2024) | You Wu et al. | 5.78
- GBDTSVM: Combined Support Vector Machine and Gradient Boosting Decision Tree Framework for efficient snoRNA-disease association prediction (2025) | Ummay Maria Muna et al. | 5.76
- CellVerse: Do Large Language Models Really Understand Cell Biology? (2025) | Fan Zhang et al. | 5.76
- Benchmarking AI scientists for omics data driven biological discovery (2025) | Erpai Luo et al. | 5.76
- Robin: A multi-agent system for automating scientific discovery (2025) | Ali Essam Ghareeb et al. | 5.76
- Universal Biological Sequence Reranking for Improved De Novo Peptide Sequencing (2025) | Zijie Qiu et al. | 5.76
- Multi-modal AI for comprehensive breast cancer prognostication (2024) | Jan Witowski et al. | 5.68
- TxAgent: An AI Agent for Therapeutic Reasoning Across a Universe of Tools (2025) | Shanghua Gao et al. | 5.65
- SynLlama: Generating Synthesizable Molecules and Their Analogs with Large Language Models (2025) | Kunyang Sun et al. | 5.65
- Democratizing AI scientists using ToolUniverse (2025) | Shanghua Gao et al. | 5.63
- Protein Large Language Models: A Comprehensive Survey (2025) | Yijia Xiao et al. | 5.59
- ProteinGPT: Multimodal LLM for Protein Property Prediction and Structure Understanding (2024) | Yijia Xiao et al. | 5.57
- ImageDDI: Image-enhanced Molecular Motif Sequence Representation for Drug-Drug Interaction Prediction (2025) | Yuqin He et al. | 5.57
- Chemistry42: An AI-based platform for de novo molecular design (2021) | Yan A. Ivanenkov et al. | 5.52
- Unraveling the Potential of Diffusion Models in Small Molecule Generation (2025) | Peining Zhang et al. | 5.52
- ProtChatGPT: Towards Understanding Proteins with Large Language Models (2024) | Chao Wang et al. | 5.51
- EHRNote-ChatQA: A Benchmark for Evidence-Grounded Multi-Turn Clinical Question Answering over Longitudinal Discharge Summaries (2026) | Jiyoun Kim et al. | 5.49
- AbRank: A Benchmark Dataset and Metric-Learning Framework for Antibody-Antigen Affinity Ranking (2025) | Chunan Liu et al. | 5.46