Awesome In-Context Learning
In-Context Learning is one of the most active areas in Awesome LLM Papers, with 2,641 papers in this collection evaluated on datasets such as GSM8K, LongBench, and RULER. A strong starting point is "React: Synergizing Reasoning And Acting In Language Models".
Key papers
- React: Synergizing Reasoning And Acting In Language Models (2022) | Shunyu Yao, Jeffrey Zhao, Dian Yu, et al. | 36.63
- BLOOM: A 176B-Parameter Open-Access Multilingual Language Model (2022) | BigScience Workshop: Teven Le Scao, Angela Fan, Christopher Akiki, et al. | 33.79
- DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning (2025) | DeepSeek-AI et al. | 31.23
- Deja Vu: Contextual Sparsity For Efficient LLMs At Inference Time (2023) | Zichang Liu, Jue Wang, Tri Dao, et al. | 29.70
- Adaptive-rag: Learning To Adapt Retrieval-augmented Large Language Models Through Question Complexity (2024) | Soyeong Jeong, Jinheon Baek, Sukmin Cho, et al. | 29.58
- Emergent Abilities Of Large Language Models (2022) | Jason Wei, Yi Tay, Rishi Bommasani, et al. | 29.54
- RLAIF Vs. RLHF: Scaling Reinforcement Learning From Human Feedback With AI Feedback (2023) | Harrison Lee, Samrat Phatale, Hassan Mansoor, et al. | 29.41
- Scaling Synthetic Data Creation With 1,000,000,000 Personas (2024) | Tao Ge, Xin Chan, Xiaoyang Wang, et al. | 29.19
- Autogen: Enabling Next-gen LLM Applications Via Multi-agent Conversation (2023) | Qingyun Wu, Gagan Bansal, Jieyu Zhang, et al. | 29.16
- Buffer Of Thoughts: Thought-augmented Reasoning With Large Language Models (2024) | Ling Yang, Zhaochen Yu, Tianjun Zhang, et al. | 27.84
- Eureka: Human-level Reward Design Via Coding Large Language Models (2023) | Yecheng Jason Ma, William Liang, Guanzhi Wang, et al. | 27.82
- Graph Of Thoughts: Solving Elaborate Problems With Large Language Models (2023) | Maciej Besta, Nils Blach, Ales Kubicek, et al. | 27.79
- Chain-of-verification Reduces Hallucination In Large Language Models (2023) | Shehzaad Dhuliawala, Mojtaba Komeili, Jing Xu, et al. | 27.71
- Duoattention: Efficient Long-context LLM Inference With Retrieval And Streaming Heads (2024) | Guangxuan Xiao, Jiaming Tang, Jingwei Zuo, et al. | 27.44
- EASYTOOL: Enhancing LLM-based Agents With Concise Tool Instruction (2024) | Siyu Yuan, Kaitao Song, Jiangjie Chen, et al. | 27.38
- The Reversal Curse: LLMs Trained On "A Is B" Fail To Learn "B Is A" (2023) | Lukas Berglund, Meg Tong, Max Kaufmann, et al. | 27.32
- Agentgym: Evolving Large Language Model-based Agents Across Diverse Environments (2024) | Zhiheng Xi, Yiwen Ding, Wenxiang Chen, et al. | 27.20
- Gsm-symbolic: Understanding The Limitations Of Mathematical Reasoning In Large Language Models (2024) | Iman Mirzadeh, Keivan Alizadeh, Hooman Shahrokhi, et al. | 27.04
- Lmsys-chat-1m: A Large-scale Real-world LLM Conversation Dataset (2023) | Lianmin Zheng, Wei-Lin Chiang, Ying Sheng, et al. | 26.82
- Learning From Mistakes Makes LLM Better Reasoner (2023) | Shengnan An, Zexiong Ma, Zeqi Lin, et al. | 26.19
- The Unlocking Spell On Base LLMs: Rethinking Alignment Via In-context Learning (2023) | Bill Yuchen Lin, Abhilasha Ravichander, Ximing Lu, et al. | 26.03
- Memgpt: Towards LLMs As Operating Systems (2023) | Charles Packer, Sarah Wooders, Kevin Lin, et al. | 25.76
- Babilong: Testing The Limits Of LLMs With Long Context Reasoning-in-a-haystack (2024) | Yuri Kuratov, Aydar Bulatov, Petr Anokhin, et al. | 25.74
- Mindmap: Knowledge Graph Prompting Sparks Graph Of Thoughts In Large Language Models (2023) | Yilin Wen, Zifeng Wang, Jimeng Sun | 25.51
- Teaching Large Language Models To Reason With Reinforcement Learning (2024) | Alex Havrilla, Yuqing Du, Sharath Chandra Raparthy, et al. | 25.08
- Reconcile: Round-table Conference Improves Reasoning Via Consensus Among Diverse LLMs (2023) | Justin Chih-Yao Chen, Swarnadeep Saha, Mohit Bansal | 24.61
- From Crowdsourced Data To High-quality Benchmarks: Arena-hard And Benchbuilder Pipeline (2024) | Tianle Li, Wei-Lin Chiang, Evan Frick, et al. | 24.39
- Advancing Transformer Architecture In Long-context Large Language Models: A Comprehensive Survey (2023) | Yunpeng Huang, Jingwei Xu, Junyu Lai, et al. | 24.27
- A Systematic Survey Of Prompt Engineering In Large Language Models: Techniques And Applications (2024) | Pranab Sahoo, Ayush Kumar Singh, Sriparna Saha, et al. | 24.27
- Tell Your Model Where To Attend: Post-hoc Attention Steering For LLMs (2023) | Qingru Zhang, Chandan Singh, Liyuan Liu, et al. | 24.26
- Simulating Classroom Education With LLM-empowered Agents (2024) | Zheyuan Zhang, Daniel Zhang-Li, Jifan Yu, et al. | 24.11
- Same Task, More Tokens: The Impact Of Input Length On The Reasoning Performance Of Large Language Models (2024) | Mosh Levy, Alon Jacoby, Yoav Goldberg | 23.56
- Tool Learning With Large Language Models: A Survey (2024) | Changle Qu, Sunhao Dai, Xiaochi Wei, et al. | 23.52
- Faithful Logical Reasoning Via Symbolic Chain-of-thought (2024) | Jundong Xu, Hao Fei, Liangming Pan, et al. | 23.51
- Adapting Large Language Models By Integrating Collaborative Semantics For Recommendation (2023) | Bowen Zheng, Yupeng Hou, Hongyu Lu, et al. | 23.45
- You Only Look At Screens: Multimodal Chain-of-action Agents (2023) | Zhuosheng Zhang, Aston Zhang | 23.42
- Graph Chain-of-thought: Augmenting Large Language Models By Reasoning On Graphs (2024) | Bowen Jin, Chulin Xie, Jiawei Zhang, et al. | 23.21
- Dyval: Dynamic Evaluation Of Large Language Models For Reasoning Tasks (2023) | Kaijie Zhu, Jiaao Chen, Jindong Wang, et al. | 23.15
- Expel: LLM Agents Are Experiential Learners (2023) | Andrew Zhao, Daniel Huang, Quentin Xu, et al. | 23.10
- Longllmlingua: Accelerating And Enhancing LLMs In Long Context Scenarios Via Prompt Compression (2023) | Huiqiang Jiang, Qianhui Wu, Xufang Luo, et al. | 23.08
- Fact, Fetch, And Reason: A Unified Evaluation Of Retrieval-augmented Generation (2024) | Satyapriya Krishna, Kalpesh Krishna, Anhad Mohananey, et al. | 22.96
- Stepcoder: Improve Code Generation With Reinforcement Learning From Compiler Feedback (2024) | Shihan Dou, Yan Liu, Haoxiang Jia, et al. | 22.83
- Cumulative Reasoning With Large Language Models (2023) | Yifan Zhang, Jingqin Yang, Yang Yuan, et al. | 22.83
- LLMs Can Easily Learn To Reason From Demonstrations Structure, Not Content, Is What Matters! (2025) | Dacheng Li, Shiyi Cao, Tyler Griggs, et al. | 22.81
- Prompt Engineering A Prompt Engineer (2023) | Qinyuan Ye, Maxamed Axmed, Reid Pryzant, et al. | 22.48
- Pose: Efficient Context Window Extension Of LLMs Via Positional Skip-wise Training (2023) | Dawei Zhu, Nan Yang, Liang Wang, et al. | 22.43
- How Do Large Language Models Acquire Factual Knowledge During Pretraining? (2024) | Hoyeon Chang, Jinho Park, Seonghyeon Ye, et al. | 22.35
- MINT: Evaluating LLMs In Multi-turn Interaction With Tools And Language Feedback (2023) | Xingyao Wang, Zihan Wang, Jiateng Liu, et al. | 22.11
- Leave No Document Behind: Benchmarking Long-context LLMs With Extended Multi-doc QA (2024) | Minzheng Wang, Longze Chen, Cheng Fu, et al. | 22.07
- Large Language Model Cascades With Mixture Of Thoughts Representations For Cost-efficient Reasoning (2023) | Murong Yue, Jie Zhao, Min Zhang, et al. | 22.06
- In-context Pretraining: Language Modeling Beyond Document Boundaries (2023) | Weijia Shi, Sewon Min, Maria Lomeli, et al. | 21.97
- Rankrag: Unifying Context Ranking With Retrieval-augmented Generation In LLMs (2024) | Yue Yu, Wei Ping, Zihan Liu, et al. | 21.80
- In-context Vectors: Making In Context Learning More Effective And Controllable Through Latent Space Steering (2023) | Sheng Liu, Haotian Ye, Lei Xing, et al. | 21.77
- List Items One By One: A New Data Source And Learning Paradigm For Multimodal LLMs (2024) | An Yan, Zhengyuan Yang, Junda Wu, et al. | 21.63
- Self-reflection In LLM Agents: Effects On Problem-solving Performance (2024) | Matthew Renze, Erhan Guven | 21.59
- Link-context Learning For Multimodal LLMs (2023) | Yan Tai, Weichen Fan, Zhao Zhang, et al. | 21.31
- From RAG To Memory: Non-parametric Continual Learning For Large Language Models (2025) | Bernal Jiménez Gutiérrez, Yiheng Shu, Weijian Qi, et al. | 21.27
- Llmlingua: Compressing Prompts For Accelerated Inference Of Large Language Models (2023) | Huiqiang Jiang, Qianhui Wu, Chin-Yew Lin, et al. | 21.22
- PPTC Benchmark: Evaluating Large Language Models For Powerpoint Task Completion (2023) | Yiduo Guo, Zekai Zhang, Yaobo Liang, et al. | 21.19
- RLEF: Grounding Code LLMs In Execution Feedback With Reinforcement Learning (2024) | Jonas Gehring, Kunhao Zheng, Jade Copet, et al. | 21.18