Awesome Information Retrieval

La representación de la variación contextual mediante definiciones terminológicas flexibles (2016)

Antonio San Martín

6.34

Beyond Parallel Sampling: Diverse Query Initialization for Agentic Search (2026)

Sidhaarth Murali et al.

6.23

Trait, Not State: The Durability of Reading Identity in Social Highlighting (2026)

Kazuki Nakayashiki et al.

5.89

M3: Conversational LLMs Simplify Secure Clinical Data Access, Understanding, and Analysis (2025)

Rafi Al Attrach et al.

5.72

Context-aware Entity-Relation Extraction for Threat Intelligence Knowledge Graphs (2026)

Inoussa Mouiche et al.

5.58

Cartridges at Scale: Training Modular KV Caches over Large Document Collections (2026)

Momchil Hardalov et al.

5.49

How Fine-Grained Should a RAG Benchmark Be? A Hierarchical Framework for Synthetic Question Generation (2026)

Chase M. Fensore et al.

5.49

On the Memorization Behavior of LLMs in Generative Recommendation: Observations, Implications, and Training Strategies (2026)

Sunwoo Kim et al.

5.49

Temporal Preference Optimization for Unsupervised Retrieval (2026)

HyunJin Kim et al.

5.49

Non-negative Elastic Net Decoding for Information Retrieval (2026)

Koki Okajima et al.

5.49

A Unified Framework for Context-Aware and Relation-Aware Graph Retrieval-Augmented Generation (2026)

Haoyang Zhong et al.

5.49

RankGraph-2: Lifecycle Co-Design for Billion-Node Graph Learning in Recommendation (2026)

Renzhi Wu et al.

5.49

Lost in a Single Vector: Improving Long-Document Retrieval with Chunk Evidence Aggregation (2026)

Shanshan Lyu et al.

5.49

MetaConfigurator: AI-Assisted RDF Authoring from JSON Data (2026)

Felix Neubauer et al.

5.01

Doc-to-Atom: Learning to Compile and Compose Memory Atoms (2026)

Xingjian Diao et al.

5.01

IUU+DB: Tracking Illegal, Unreported, and Unregulated Fishing, Seafood Fraud, and Labor Abuse through LLM-driven Information Extraction (2026)

Henry Bodwell et al.

5.01

Beyond Text and Tables: Vision-Language Model Integration in ComProScanner for Extracting Materials Data from Scientific Figures with High Accuracy (2026)

Aritra Roy et al.

4.39

Graph2Idea: Retrieval-Augmented Scientific Idea Generation with Graph-Structured Contexts (2026)

Xu Li et al.

4.39

A PubMed-Scale Dataset of Structured Biomedical Abstracts (2026)

Chia-Hsuan Chang et al.

4.39

Cross-Dataset Bloom Question Classification: Supervised Models and Prompted LLMs (2026)

Abdolali Faraji et al.

4.39

Hyperdimensional computing for structured querying on tabular data embeddings (2026)

Sebastián Bugedo et al.

4.39

Context-aware Modality-Topology Co-Alignment for Multimodal Attributed Graphs (2026)

Sirui Zhang et al.

4.39

Context Compression Is Not One Thing: Readable Symbolic Re-expression vs. Coherent Summary at Matched Budget (2026)

Sisong Bei et al.

4.39

Semantics-Enhanced Retrieval-Augmented Time Series Forecasting (2026)

Shiqiao Zhou et al.

4.39

Semantic Reasoning in Medicine: The Role of Knowledge Graphs Across Five Key Domains (2026)

Haniye Sherafatmandjoo et al.

4.39

Provenance-Enhanced Statements in Knowledge Graphs (2026)

Fabio Vitali et al.

4.39

Guiding Federated Graph Recommendation with LLM-encoded knowledge (2026)

Thi Minh Chau Nguyen et al.

4.39

Few-Shot Biomedical Relation Extraction with Large Language Models: A Viable Alternative to Supervised Learning? (2026)

Jakob Mraz et al.

4.39

Encode Errors: Representational Retrieval of In-Context Demonstrations for Multilingual Grammatical Error Correction (2026)

Guangyue Peng et al.

4.39

Ricci-Filtration: Boosting Retrieval-Augmented Generation Reranker to Query-Answer Tasks by Discrete Ricci Flow (2026)

Tian Qin et al.

4.39

AthDGC: An Open Diachronic Greek Treebank with Indo-European Parallels (2026)

Nikolaos Lavidas et al.

4.39

Overcoming the Impedance Mismatch: A Theoretical Roadmap for Fusing Foundation Models and Knowledge Graphs (2026)

Sahil Rajesh Dhayalkar

4.39

Vernier: Probing Representational Misalignment Behind Lexical Gaps in Causal Reasoning (2026)

Zhenyu Yu

4.39

A Self Consistency Based Reranking for Narrative Question Answering (2026)

Molham Mohamed et al.

4.39

Entity Labels Are Not Entity Signals: A Framework for Observable Relevance in Document Re-Ranking (2026)

Utshab Kumar Ghosh et al.

4.39

IBAD: Interpretable Behavioral Anomaly Detection on Human Mobility Data (2026)

Bita Azarijoo et al.

4.39

SCAR: Semantic Continuity-Aware Retrieval for Efficient Context Expansion in RAG (2026)

Nathanaël Langlois

4.39

RAID: Semantic Graph Diffusion for True Cold-Start and Cross-Lingual Forecasting (2026)

Arunkumar V et al.

4.39

How Much Do Reviews Really Contribute? A Study on Text-Enriched Matrix Factorization for Recommendations (2026)

Eduardo Ferreira da Silva et al.

4.39

Want Better Synthetic Data? Steer It: Activation Steering for Low-Resource Language Generation (2026)

Jan Cegin et al.

4.39

BCL: Bayesian In-Context Learning Framework for Information Extraction (2026)

Haoliang Liu et al.

4.39

SHIFT: Semantic Harmonization via Index-side Feature Transformation for Multilingual Information Retrieval (2026)

Youngjoon Jang et al.

4.39

ScholarSum: Student-Teacher Abstractive Summarization via Knowledge Graph Reasoning and Reflective Refinement (2026)

Bohou Zhang et al.

4.39

Aligning Implied Statements for Implicit Hate Speech Generalizability with Context-Bounded Semi-hard Negative Mining (2026)

Wicaksono Leksono Muhamad et al.

4.39

Approximate Structured Diffusion for Sequence Labelling (2026)

Nicolas Floquet et al.

4.39

Efficient Financial Language Understanding via Distillation with Synthetic Data (2026)

Wen-Fong (Xavier) et al.

4.39

Improving Medical Communication using Rubric-Guided Counterfactual Recommendations (2026)

Adrian Cosma et al.

4.39

Learning Robust Pair Confidence for Multimodal Emotion-Cause Pair Extraction (2026)

Zhuangzhuang Pan et al.

4.39

SAERec: Constructing Fine-grained Interpretable Intents Priors via Sparse Autoencoders for Recommendation (2026)

Jiangnan Xia et al.

4.39

Zero-Shot Active Feature Acquisition via LLM-Elicitation (2026)

Binyamin Perets et al.

4.39

Beyond Tokenization: Direct Timestep Embedding and Contrastive Alignment for Time-Series Question Answering (2026)

Yafeng Wu et al.

4.39

JourneyFormer: Encoding Airbnb Guest Journey with Sequence Modeling (2026)

Daochen Zha et al.

4.39

The More the Merrier: Combining Properties for ABox Abduction under Repair Semantics for EL⊥ (2026)

Anselm Haak et al.

4.39

LECTOR: Joint Optimization of Scientific Reasoning Graphs and Introduction Generation (2026)

Jiabei Xiao et al.

4.33

Eliot: Interactively Exploring Fast-Changing Scientific Literature Trends with Online Data and Learning (2026)

Bernardo A. Denkvitts et al.

4.33

SETUP: Sentence-level English-To-Uniform Meaning Representation Parser (2026)

Emma Markle et al.

4.26

DualRAG: A Dual-Process Approach to Integrate Reasoning and Retrieval for Multi-Hop Question Answering (2025)

Rong Cheng et al.

3.75

Effective Reinforcement Learning for Agentic Search by Recycling Zero-Variance Queries During Training (2026)

João Coelho et al.

3.51

Benchmarking Large Language Models for Safety Data Extraction (2026)

Jonas Grill et al.

3.51

The Culture Funnel: You Can't Align What isn't in the Data (2026)

Ananya Sahu et al.

3.51

Datasets & benchmarks

Key papers