Awesome RAG
Retrieval-augmented generation (RAG) is one of the most active areas in Awesome LLM Papers: this collection holds 3,082 papers on the topic, evaluated on datasets such as GSM8K, MMLU, and HotpotQA. A strong starting point is "Efficient Memory Management For Large Language Model Serving With PagedAttention".
Datasets & benchmarks
Key papers
- Efficient Memory Management For Large Language Model Serving With PagedAttention (2023) | Woosuk Kwon, Zhuohan Li, Siyuan Zhuang, et al. | 46.56
- Self-RAG: Learning To Retrieve, Generate, And Critique Through Self-reflection (2023) | Akari Asai, Zeqiu Wu, Yizhong Wang, et al. | 33.82
- Prefix-tuning: Optimizing Continuous Prompts For Generation (2021) | Xiang Lisa Li, Percy Liang | 32.38
- Adaptive-RAG: Learning To Adapt Retrieval-augmented Large Language Models Through Question Complexity (2024) | Soyeong Jeong, Jinheon Baek, Sukmin Cho, et al. | 29.58
- Scaling Synthetic Data Creation With 1,000,000,000 Personas (2024) | Tao Ge, Xin Chan, Xiaoyang Wang, et al. | 29.19
- MultiHop-RAG: Benchmarking Retrieval-augmented Generation For Multi-hop Queries (2024) | Yixuan Tang, Yi Yang | 27.60
- DuoAttention: Efficient Long-context LLM Inference With Retrieval And Streaming Heads (2024) | Guangxuan Xiao, Jiaming Tang, Jingwei Zuo, et al. | 27.44
- FlashRAG: A Modular Toolkit For Efficient Retrieval-augmented Generation Research (2024) | Jiajie Jin, Yutao Zhu, Guanting Dong, et al. | 26.95
- SaulLM-7B: A Pioneering Large Language Model For Law (2024) | Pierre Colombo, Telmo Pessoa Pires, Malik Boudiaf, et al. | 26.00
- Scaling Retrieval-based Language Models With A Trillion-token Datastore (2024) | Rulin Shao, Jacqueline He, Akari Asai, et al. | 25.71
- FreshLLMs: Refreshing Large Language Models With Search Engine Augmentation (2023) | Tu Vu, Mohit Iyyer, Xuezhi Wang, et al. | 25.58
- MindMap: Knowledge Graph Prompting Sparks Graph Of Thoughts In Large Language Models (2023) | Yilin Wen, Zifeng Wang, Jimeng Sun | 25.51
- CacheBlend: Fast Large Language Model Serving For RAG With Cached Knowledge Fusion (2024) | Jiayi Yao, Hanchen Li, Yuhan Liu, et al. | 25.46
- Ragas: Automated Evaluation Of Retrieval Augmented Generation (2023) | Shahul Es, Jithin James, Luis Espinosa-Anke, et al. | 24.85
- ControlLLM: Augment Language Models With Tools By Searching On Graphs (2023) | Zhaoyang Liu, Zeqiang Lai, Zhangwei Gao, et al. | 24.66
- LayoutLLM: Layout Instruction Tuning With Large Language Models For Document Understanding (2024) | Chuwei Luo, Yufan Shen, Zhaoqing Zhu, et al. | 24.66
- Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To! (2023) | Xiangyu Qi, Yi Zeng, Tinghao Xie, et al. | 24.31
- A Systematic Survey Of Prompt Engineering In Large Language Models: Techniques And Applications (2024) | Pranab Sahoo, Ayush Kumar Singh, Sriparna Saha, et al. | 24.27
- FederatedScope-LLM: A Comprehensive Package For Fine-tuning Large Language Models In Federated Learning (2023) | Weirui Kuang, Bingchen Qian, Zitao Li, et al. | 24.26
- Tell Your Model Where To Attend: Post-hoc Attention Steering For LLMs (2023) | Qingru Zhang, Chandan Singh, Liyuan Liu, et al. | 24.26
- Patchscopes: A Unifying Framework For Inspecting Hidden Representations Of Language Models (2024) | Asma Ghandeharioun, Avi Caciularu, Adam Pearce, et al. | 23.87
- DISC-LawLLM: Fine-tuning Large Language Models For Intelligent Legal Services (2023) | Shengbin Yue, Wei Chen, Siyuan Wang, et al. | 23.76
- Optimize Weight Rounding Via Signed Gradient Descent For The Quantization Of LLMs (2023) | Wenhua Cheng, Weiwei Zhang, Haihao Shen, et al. | 23.56
- Tool Learning With Large Language Models: A Survey (2024) | Changle Qu, Sunhao Dai, Xiaochi Wei, et al. | 23.52
- Graph Chain-of-thought: Augmenting Large Language Models By Reasoning On Graphs (2024) | Bowen Jin, Chulin Xie, Jiawei Zhang, et al. | 23.21
- ExpeL: LLM Agents Are Experiential Learners (2023) | Andrew Zhao, Daniel Huang, Quentin Xu, et al. | 23.10
- Fact, Fetch, And Reason: A Unified Evaluation Of Retrieval-augmented Generation (2024) | Satyapriya Krishna, Kalpesh Krishna, Anhad Mohananey, et al. | 22.96
- Shepherd: A Critic For Language Model Generation (2023) | Tianlu Wang, Ping Yu, Xiaoqing Ellen Tan, et al. | 22.89
- VisRAG: Vision-based Retrieval-augmented Generation On Multi-modality Documents (2024) | Shi Yu, Chaoyue Tang, Bokai Xu, et al. | 22.87
- Reasoning On Graphs: Faithful And Interpretable Large Language Model Reasoning (2023) | Linhao Luo, Yuan-Fang Li, Gholamreza Haffari, et al. | 22.59
- Does Fine-tuning LLMs On New Knowledge Encourage Hallucinations? (2024) | Zorik Gekhman, Gal Yona, Roee Aharoni, et al. | 22.36
- MemoRAG: Boosting Long Context Processing With Global Memory-enhanced Retrieval Augmentation (2024) | Hongjin Qian, Zheng Liu, Peitian Zhang, et al. | 22.21
- RAGTruth: A Hallucination Corpus For Developing Trustworthy Retrieval-augmented Language Models (2024) | Cheng Niu, Yuanhao Wu, Juno Zhu, et al. | 22.10
- StructRAG: Boosting Knowledge Intensive Reasoning Of LLMs Via Inference-time Hybrid Information Structurization (2024) | Zhuoqun Li, Xuanang Chen, Haiyang Yu, et al. | 22.07
- AST-T5: Structure-aware Pretraining For Code Generation And Understanding (2024) | Linyuan Gong, Mostafa Elhoushi, Alvin Cheung | 21.95
- A Survey On RAG Meeting LLMs: Towards Retrieval-augmented Large Language Models (2024) | Wenqi Fan, Yujuan Ding, Liangbo Ning, et al. | 21.90
- RankRAG: Unifying Context Ranking With Retrieval-augmented Generation In LLMs (2024) | Yue Yu, Wei Ping, Zihan Liu, et al. | 21.80
- From RAG To Memory: Non-parametric Continual Learning For Large Language Models (2025) | Bernal Jiménez Gutiérrez, Yiheng Shu, Weijian Qi, et al. | 21.27
- FinGPT: Large Generative Models For A Small Language (2023) | Risto Luukkonen, Ville Komulainen, Jouni Luoma, et al. | 21.27
- RLEF: Grounding Code LLMs In Execution Feedback With Reinforcement Learning (2024) | Jonas Gehring, Kunhao Zheng, Jade Copet, et al. | 21.18
- A Simple And Effective \(L_2\) Norm-based Strategy For KV Cache Compression (2024) | Alessio Devoto, Yu Zhao, Simone Scardapane, et al. | 20.70
- The Good And The Bad: Exploring Privacy Issues In Retrieval-augmented Generation (RAG) (2024) | Shenglai Zeng, Jiankun Zhang, Pengfei He, et al. | 20.64
- Understand What LLM Needs: Dual Preference Alignment For Retrieval-augmented Generation (2024) | Guanting Dong, Yutao Zhu, Chenghao Zhang, et al. | 20.23
- DISC-FinLLM: A Chinese Financial Large Language Model Based On Multiple Experts Fine-tuning (2023) | Wei Chen, Qiushi Wang, Zefei Long, et al. | 20.11
- Jatmo: Prompt Injection Defense By Task-specific Finetuning (2023) | Julien Piet, Maha Alrashed, Chawin Sitawarin, et al. | 20.06
- Benchmarking Uncertainty Quantification Methods For Large Language Models With LM-Polygraph (2024) | Roman Vashurin, Ekaterina Fadeeva, Artem Vazhentsev, et al. | 20.04
- Can Large Language Models Understand Real-world Complex Instructions? (2023) | Qianyu He, Jie Zeng, Wenhao Huang, et al. | 19.91
- Continual Pre-training For Cross-lingual LLM Adaptation: Enhancing Japanese Language Capabilities (2024) | Kazuki Fujii, Taishi Nakamura, Mengsay Loem, et al. | 19.81
- Fact-checking The Output Of Large Language Models Via Token-level Uncertainty Quantification (2024) | Ekaterina Fadeeva, Aleksandr Rubashevskii, Artem Shelmanov, et al. | 19.78
- Chain-of-note: Enhancing Robustness In Retrieval-augmented Language Models (2023) | Wenhao Yu, Hongming Zhang, Xiaoman Pan, et al. | 19.67
- ClashEval: Quantifying The Tug-of-war Between An LLM's Internal Prior And External Evidence (2024) | Kevin Wu, Eric Wu, James Zou | 19.62
- Stacking Your Transformers: A Closer Look At Model Growth For Efficient LLM Pre-training (2024) | Wenyu Du, Tongxu Luo, Zihan Qiu, et al. | 19.56
- OmniEval: An Omnidirectional And Automatic RAG Evaluation Benchmark In Financial Domain (2024) | Shuting Wang, Jiejun Tan, Zhicheng Dou, et al. | 19.52
- MagicPIG: LSH Sampling For Efficient LLM Generation (2024) | Zhuoming Chen, Ranajoy Sadhukhan, Zihao Ye, et al. | 19.39
- FaithEval: Can Your Language Model Stay Faithful To Context, Even If "the Moon Is Made Of Marshmallows" (2024) | Yifei Ming, Senthil Purushwalkam, Shrey Pandit, et al. | 19.34
- Pearl: Personalizing Large Language Model Writing Assistants With Generation-calibrated Retrievers (2023) | Sheshera Mysore, Zhuoran Lu, Mengting Wan, et al. | 19.23
- Assessing The Brittleness Of Safety Alignment Via Pruning And Low-rank Modifications (2024) | Boyi Wei, Kaixuan Huang, Yangsibo Huang, et al. | 19.00
- From Words To Watts: Benchmarking The Energy Costs Of Large Language Model Inference (2023) | Siddharth Samsi, Dan Zhao, Joseph McDonald, et al. | 18.71
- DDK: Distilling Domain Knowledge For Efficient Large Language Models (2024) | Jiaheng Liu, Chenchen Zhang, Jinyang Guo, et al. | 18.67
- RepairAgent: An Autonomous, LLM-based Agent For Program Repair (2024) | Islem Bouzenia, Premkumar Devanbu, Michael Pradel | 18.60