Awesome Fine-Tuning
Fine-Tuning is one of the most active areas in Awesome LLM Papers: this collection holds 5,035 papers, evaluated on datasets such as GSM8K, MMLU, and MATH-500. A strong starting point is "LoRA: Low-rank Adaptation Of Large Language Models".
Datasets & benchmarks
Key papers
- LoRA: Low-rank Adaptation Of Large Language Models (2021). Edward J. Hu, Yelong Shen, Phillip Wallis, et al. (48.60)
- Zephyr: Direct Distillation Of LM Alignment (2023). Lewis Tunstall, Edward Beeching, Nathan Lambert, et al. (38.56)
- Training Language Models To Follow Instructions With Human Feedback (2022). Long Ouyang, Jeff Wu, Xu Jiang, et al. (36.92)
- QLoRA: Efficient Finetuning Of Quantized LLMs (2023). Tim Dettmers, Artidoro Pagnoni, Ari Holtzman, et al. (36.23)
- MiniCPM: Unveiling The Potential Of Small Language Models With Scalable Training Strategies (2024). Shengding Hu, Yuge Tu, Xu Han, et al. (36.01)
- AgentTuning: Enabling Generalized Agent Abilities For LLMs (2023). Aohan Zeng, Mingdao Liu, Rui Lu, et al. (32.67)
- Prefix-tuning: Optimizing Continuous Prompts For Generation (2021). Xiang Lisa Li, Percy Liang (32.38)
- JudgeLM: Fine-tuned Large Language Models Are Scalable Judges (2023). Lianghui Zhu, Xinggang Wang, Xinlong Wang (31.75)
- The FineWeb Datasets: Decanting The Web For The Finest Text Data At Scale (2024). Guilherme Penedo, Hynek Kydlíček, Loubna Ben Allal, et al. (31.68)
- SPHINX: The Joint Mixing Of Weights, Tasks, And Visual Embeddings For Multi-modal Large Language Models (2023). Ziyi Lin, Chris Liu, Renrui Zhang, et al. (31.35)
- Step-DPO: Step-wise Preference Optimization For Long-chain Reasoning Of LLMs (2024). Xin Lai, Zhuotao Tian, Yukang Chen, et al. (31.31)
- Adaptive-RAG: Learning To Adapt Retrieval-augmented Large Language Models Through Question Complexity (2024). Soyeong Jeong, Jinheon Baek, Sukmin Cho, et al. (29.58)
- QA-LoRA: Quantization-aware Low-rank Adaptation Of Large Language Models (2023). Yuhui Xu, Lingxi Xie, Xiaotao Gu, et al. (29.52)
- Prometheus: Inducing Fine-grained Evaluation Capability In Language Models (2023). Seungone Kim, Jamin Shin, Yejin Cho, et al. (28.76)
- UltraFeedback: Boosting Language Models With Scaled AI Feedback (2023). Ganqu Cui, Lifan Yuan, Ning Ding, et al. (27.73)
- Contrastive Preference Optimization: Pushing The Boundaries Of LLM Performance In Machine Translation (2024). Haoran Xu, Amr Sharaf, Yunmo Chen, et al. (27.70)
- Aya Model: An Instruction Finetuned Open-access Multilingual Language Model (2024). Ahmet Üstün, Viraat Aryabumi, Zheng-Xin Yong, et al. (27.57)
- Magicoder: Empowering Code Generation With OSS-Instruct (2023). Yuxiang Wei, Zhe Wang, Jiawei Liu, et al. (27.43)
- InternLM2 Technical Report (2024). Zheng Cai, Maosong Cao, Haojiong Chen, et al. (27.03)
- LongAlign: A Recipe For Long Context Alignment Of Large Language Models (2024). Yushi Bai, Xin Lv, Jiajie Zhang, et al. (26.93)
- DoLa: Decoding By Contrasting Layers Improves Factuality In Large Language Models (2023). Yung-Sung Chuang, Yujia Xie, Hongyin Luo, et al. (26.86)
- DataDreamer: A Tool For Synthetic Data Generation And Reproducible LLM Workflows (2024). Ajay Patel, Colin Raffel, Chris Callison-Burch (26.77)
- A Survey On Large Language Models For Code Generation (2024). Juyong Jiang, Fan Wang, Jiasi Shen, et al. (26.64)
- Learning From Mistakes Makes LLM Better Reasoner (2023). Shengnan An, Zexiong Ma, Zeqi Lin, et al. (26.19)
- The Unlocking Spell On Base LLMs: Rethinking Alignment Via In-context Learning (2023). Bill Yuchen Lin, Abhilasha Ravichander, Ximing Lu, et al. (26.03)
- MAmmoTH: Building Math Generalist Models Through Hybrid Instruction Tuning (2023). Xiang Yue, Xingwei Qu, Ge Zhang, et al. (25.94)
- Direct Nash Optimization: Teaching Language Models To Self-improve With General Preferences (2024). Corby Rosset, Ching-An Cheng, Arindam Mitra, et al. (25.73)
- Fine-tuning Language Models For Factuality (2023). Katherine Tian, Eric Mitchell, Huaxiu Yao, et al. (25.62)
- FreshLLMs: Refreshing Large Language Models With Search Engine Augmentation (2023). Tu Vu, Mohit Iyyer, Xuezhi Wang, et al. (25.58)
- Agent-FLAN: Designing Data And Methods Of Effective Agent Tuning For Large Language Models (2024). Zehui Chen, Kuikun Liu, Qiuchen Wang, et al. (25.55)
- When Scaling Meets LLM Finetuning: The Effect Of Data, Model And Finetuning Method (2024). Biao Zhang, Zhongtao Liu, Colin Cherry, et al. (25.34)
- LLaMAX: Scaling Linguistic Horizons Of LLM By Enhancing Translation Capabilities Beyond 100 Languages (2024). Yinquan Lu, Wenhao Zhu, Lei Li, et al. (25.17)
- Detoxifying Large Language Models Via Knowledge Editing (2024). Mengru Wang, Ningyu Zhang, Ziwen Xu, et al. (25.02)
- Toward Self-improvement Of LLMs Via Imagination, Searching, And Criticizing (2024). Ye Tian, Baolin Peng, Linfeng Song, et al. (24.80)
- Continual Learning Of Large Language Models: A Comprehensive Survey (2024). Haizhou Shi, Zihao Xu, Hengyi Wang, et al. (24.79)
- LayoutLLM: Layout Instruction Tuning With Large Language Models For Document Understanding (2024). Chuwei Luo, Yufan Shen, Zhaoqing Zhu, et al. (24.66)
- Self-play With Execution Feedback: Improving Instruction-following Capabilities Of Large Language Models (2024). Guanting Dong, Keming Lu, Chengpeng Li, et al. (24.59)
- Personalized Soups: Personalized Large Language Model Alignment Via Post-hoc Parameter Merging (2023). Joel Jang, Seungone Kim, Bill Yuchen Lin, et al. (24.50)
- ImageBind-LLM: Multi-modality Instruction Tuning (2023). Jiaming Han, Renrui Zhang, Wenqi Shao, et al. (24.37)
- Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To! (2023). Xiangyu Qi, Yi Zeng, Tinghao Xie, et al. (24.31)
- FederatedScope-LLM: A Comprehensive Package For Fine-tuning Large Language Models In Federated Learning (2023). Weirui Kuang, Bingchen Qian, Zitao Li, et al. (24.26)
- Compact Language Models Via Pruning And Knowledge Distillation (2024). Saurav Muralidharan, Sharath Turuvekere Sreenivas, Raviraj Joshi, et al. (24.15)
- Platypus: Quick, Cheap, And Powerful Refinement Of LLMs (2023). Ariel N. Lee, Cole J. Hunter, Nataniel Ruiz (24.03)
- LM-Cocktail: Resilient Tuning Of Language Models Via Model Merging (2023). Shitao Xiao, Zheng Liu, Peitian Zhang, et al. (23.85)
- ChatGLM-Math: Improving Math Problem-solving In Large Language Models With A Self-critique Pipeline (2024). Yifan Xu, Xiao Liu, Xinghan Liu, et al. (23.84)
- LLaMA-MoE: Building Mixture-of-experts From LLaMA With Continual Pre-training (2024). Tong Zhu, Xiaoye Qu, Daize Dong, et al. (23.81)
- Nash Learning From Human Feedback (2023). Rémi Munos, Michal Valko, Daniele Calandriello, et al. (23.79)
- DISC-LawLLM: Fine-tuning Large Language Models For Intelligent Legal Services (2023). Shengbin Yue, Wei Chen, Siyuan Wang, et al. (23.76)
- Self-exploring Language Models: Active Preference Elicitation For Online Alignment (2024). Shenao Zhang, Donghan Yu, Hiteshi Sharma, et al. (23.63)
- Synthetic Data (almost) From Scratch: Generalized Instruction Tuning For Language Models (2024). Haoran Li, Qingxiu Dong, Zhengyang Tang, et al. (23.56)
- Large Language Model Unlearning Via Embedding-corrupted Prompts (2024). Chris Yuhao Liu, Yaxuan Wang, Jeffrey Flanigan, et al. (23.55)
- Instruction Tuning For Large Language Models: A Survey (2023). Shengyu Zhang, Linfeng Dong, Xiaoya Li, et al. (23.46)
- Two Tales Of Persona In LLMs: A Survey Of Role-playing And Personalization (2024). Yu-Min Tseng, Yu-Chao Huang, Teng-Yun Hsiao, et al. (23.45)
- Adapting Large Language Models By Integrating Collaborative Semantics For Recommendation (2023). Bowen Zheng, Yupeng Hou, Hongyu Lu, et al. (23.45)
- Table-GPT: Table-tuned GPT For Diverse Table Tasks (2023). Peng Li, Yeye He, Dror Yashar, et al. (23.44)
- MMC: Advancing Multimodal Chart Understanding With Large-scale Instruction Tuning (2023). Fuxiao Liu, Xiaoyang Wang, Wenlin Yao, et al. (23.28)
- A Paradigm Shift In Machine Translation: Boosting Translation Performance Of Large Language Models (2023). Haoran Xu, Young Jin Kim, Amr Sharaf, et al. (23.20)
- ChipNeMo: Domain-adapted LLMs For Chip Design (2023). Mingjie Liu, Teodor-Dumitru Ene, Robert Kirby, et al. (23.12)
- ExpeL: LLM Agents Are Experiential Learners (2023). Andrew Zhao, Daniel Huang, Quentin Xu, et al. (23.10)
- Let The Expert Stick To His Last: Expert-specialized Fine-tuning For Sparse Architectural Large Language Models (2024). Zihan Wang, Deli Chen, Damai Dai, et al. (23.04)