Awesome LLM Security
LLM Security is one of the most active areas in Awesome Cybersecurity, with 721 papers in this collection, evaluated on datasets such as CICIDS2017, NSL-KDD, and UNSW-NB15. A strong starting point is "Adversarial Machine Learning Attacks And Defense Methods In The Cyber Security Domain".
Datasets & benchmarks
Key papers
- Adversarial Machine Learning Attacks And Defense Methods In The Cyber Security Domain (2020) | Ihai Rosenberg, Asaf Shabtai, Yuval Elovici, et al. | 17.66
- InjecAgent: Benchmarking Indirect Prompt Injections In Tool-integrated Large Language Model Agents (2024) | Qiusi Zhan, Zhixiang Liang, Zifan Ying, et al. | 15.95
- CyberLearning: Effectiveness Analysis Of Machine Learning Security Modeling To Detect Cyber-anomalies And Multi-attacks (2021) | Iqbal H. Sarker | 15.10
- A Survey On Explainable Artificial Intelligence For Cybersecurity (2023) | Gaith Rjoub, Jamal Bentahar, Omar Abdel Wahab, et al. | 14.90
- Exploiting Programmatic Behavior Of LLMs: Dual-use Through Standard Security Attacks (2023) | Daniel Kang, Xuechen Li, Ion Stoica, et al. | 13.70
- Security Vulnerability Detection Using Deep Learning Natural Language Processing (2021) | Noah Ziems, Shaoen Wu | 13.55
- Generative AI In Cybersecurity: A Comprehensive Review Of LLM Applications And Vulnerabilities (2024) | Mohamed Amine Ferrag, Fatima Alwahedi, Ammar Battah, et al. | 13.17
- Large Language Models In Cybersecurity: State-of-the-art (2024) | Farzad Nourmohammadzadeh Motlagh, Mehrdad Hajizadeh, Mehryar Majd, et al. | 11.85
- Prompt As Triggers For Backdoor Attack: Examining The Vulnerability In Language Models (2023) | Shuai Zhao, Jinming Wen, Luu Anh Tuan, et al. | 11.85
- CIPHER: Cybersecurity Intelligent Penetration-testing Helper For Ethical Researcher (2024) | Derry Pratama, Naufal Suryanto, Andro Aprila Adiputra, et al. | 11.78
- Learn To Adapt: Robust Drift Detection In Security Domain (2022) | Aditya Kuppa, Nhien-An Le-Khac | 11.58
- LLMs In Software Security: A Survey Of Vulnerability Detection Techniques And Insights (2025) | Ze Sheng, Zhicheng Chen, Shuning Gu, et al. | 11.54
- Exploring The Dark Side Of AI: Advanced Phishing Attack Design And Deployment Using ChatGPT (2023) | Nils Begou, Jeremy Vinoy, Andrzej Duda, et al. | 11.29
- Deep Reinforcement Learning For Cybersecurity Threat Detection And Protection: A Review (2022) | Mohit Sewak, Sanjay K. Sahay, Hemant Rathore | 11.29
- PoisonPrompt: Backdoor Attack On Prompt-based Large Language Models (2023) | Hongwei Yao, Jian Lou, Zhan Qin | 11.19
- FedSecurity: Benchmarking Attacks And Defenses In Federated Learning And Federated LLMs (2023) | Shanshan Han, Baturalp Buyukates, Zijian Hu, et al. | 10.85
- Enhancing Threat Detection Using Artificial Intelligence in Modern Cybersecurity Systems Using SPSS Statistics (2026) | Rajendar Dommeti | 10.82
- Towards Explainable Network Intrusion Detection Using Large Language Models (2024) | Paul R. B. Houssel, Priyanka Singh, Siamak Layeghy, et al. | 10.74
- DOLOS: A Novel Architecture For Moving Target Defense (2023) | Giulio Pagnotta, Fabio de Gaspari, Dorjan Hitaj, et al. | 10.74
- Machine Learning (in) Security: A Stream Of Problems (2020) | Fabrício Ceschin, Marcus Botacin, Albert Bifet, et al. | 10.61
- CTIBench: A Benchmark For Evaluating LLMs In Cyber Threat Intelligence (2024) | Md Tanvirul Alam, Dipkamal Bhusal, Le Nguyen, et al. | 10.48
- Living-off-the-land Command Detection Using Active Learning (2021) | Talha Ongun, Jack W. Stokes, Jonathan Bar Or, et al. | 10.48
- Next-generation Phishing: How LLM Agents Empower Cyber Attackers (2024) | Khalifa Afane, Wenqi Wei, Ying Mao, et al. | 10.35
- LogPrécis: Unleashing Language Models For Automated Malicious Log Analysis (2023) | Matteo Boffa, Rodolfo Vieira Valentim, Luca Vassio, et al. | 10.21
- On The (in)security Of Peer-to-peer Decentralized Machine Learning (2022) | Dario Pasquini, Mathilde Raynal, Carmela Troncoso | 10.07
- Tackling Imbalanced Data In Cybersecurity With Transfer Learning: A Case With ROP Payload Detection (2021) | Haizhou Wang, Peng Liu | 9.92
- Enhancing 6G-IoT Network Security: A Trustworthy and Responsible AI-Driven Stacked-Hybrid Model for Attack Detection (2026) | Anshika Sharma et al. | 9.81
- Illuminati: Towards Explaining Graph Neural Networks For Cybersecurity Analysis (2023) | Haoyu He, Yuede Ji, H. Howie Huang | 9.41
- With Great Dispersion Comes Greater Resilience: Efficient Poisoning Attacks And Defenses For Linear Regression Models (2020) | Jialin Wen, Benjamin Zi Hao Zhao, Minhui Xue, et al. | 9.23
- ModSec-Learn: Boosting ModSecurity With Machine Learning (2024) | Christian Scano, Giuseppe Floris, Biagio Montaruli, et al. | 9.16
- NYU CTF Bench: A Scalable Open-source Benchmark Dataset For Evaluating LLMs In Offensive Security (2024) | Minghao Shao, Sofija Jancheska, Meet Udeshi, et al. | 8.85
- Towards Automated Penetration Testing: Introducing LLM Benchmark, Analysis, And Improvements (2024) | Isamu Isozaki, Manil Shrestha, Rick Console, et al. | 8.82
- Intrusion Detection System Using Deep Learning For Network Security (2025) | Soham Chatterjee, Satvik Chaudhary, Aswani Kumar Cherukuri | 8.65
- A Survey Of Source Code Representations For Machine Learning-based Cybersecurity Tasks (2024) | Beatrice Casey, Joanna C. S. Santos, George Perry | 8.60
- Membership Inference Attacks Against In-context Learning (2024) | Rui Wen, Zheng Li, Michael Backes, et al. | 8.35
- APT-LLM: Embedding-based Anomaly Detection Of Cyber Advanced Persistent Threats Using Large Language Models (2025) | Sidahmed Benabderrahmane, Petko Valtchev, James Cheney, et al. | 8.23
- On Security And Sparsity Of Linear Classifiers For Adversarial Settings (2017) | Ambra Demontis, Paolo Russu, Battista Biggio, et al. | 7.81
- Large Language Model (LLM) For Software Security: Code Analysis, Malware Analysis, Reverse Engineering (2025) | Hamed Jelodar, Samita Bai, Parisa Hamedi, et al. | 7.24
- Autonomous Threat Detection And Response In Cloud Security: A Comprehensive Survey Of AI-driven Strategies (2026) | Gaurav Sarraf, Vibhor Pal | 7.24
- Feasibility Study For Supporting Static Malware Analysis Using LLM (2024) | Shota Fujii, Rei Yamagishi | 7.16
- Fine-tuned Large Language Models (LLMs): Improved Prompt Injection Attacks Detection (2024) | Md Abdur Rahman, Fan Wu, Alfredo Cuzzocrea, et al. | 7.16
- Leveraging Large Language Models To Detect npm Malicious Packages (2024) | Nusrat Zahan, Philipp Burckhardt, Mikola Lysenko, et al. | 7.16
- Models Are Codes: Towards Measuring Malicious Code Poisoning Attacks On Pre-trained Model Hubs (2024) | Jian Zhao, Shenao Wang, Yanjie Zhao, et al. | 7.16
- A Comparative Analysis Of DNN-based White-box Explainable AI Methods In Network Security (2025) | Osvaldo Arreche, Mustafa Abdallah | 7.06
- Security Concerns For Large Language Models: A Survey (2025) | Miles Q. Li, Benjamin C. M. Fung | 7.06
- Research On Enhancing Cloud Computing Network Security Using Artificial Intelligence Algorithms (2025) | Yuqing Wang, Xiao Yang | 7.06
- A Dynamic-adversarial Mining Approach To The Security Of Machine Learning (2018) | Tegjyot Singh Sethi, Mehmed Kantardzic, Lingyu Lyua, et al. | 6.77
- Be Kind, Rewrite: Benign Projections via Rewriting Defend Against LLM Data Poisoning Attacks (2026) | John T. Halloran et al. | 6.69
- GenXSS: An AI-driven Framework For Automated Detection Of XSS Attacks In WAFs (2025) | Vahid Babaey, Arun Ravindran | 6.64
- Adaptive Attacks Break Defenses Against Indirect Prompt Injection Attacks On LLM Agents (2025) | Qiusi Zhan, Richard Fang, Henil Shalin Panchal, et al. | 6.39
- AutoRedTeamer: Autonomous Red Teaming With Lifelong Attack Integration (2025) | Andy Zhou, Kevin Wu, Francesco Pinto, et al. | 6.39
- VulScribeR: Exploring RAG-based Vulnerability Augmentation With LLMs (2024) | Seyed Shayan Daneshvar, Yu Nong, Xu Yang, et al. | 6.34
- APOLLO: A GPT-based Tool To Detect Phishing Emails And Generate Explanations That Warn Users (2024) | Giuseppe Desolda, Francesco Greco, Luca Viganò | 6.34
- PsybORG+: Modeling And Simulation For Detecting Cognitive Biases In Advanced Persistent Threats (2024) | Shuo Huang, Fred Jones, Nikolos Gurney, et al. | 6.34
- SpearBot: Leveraging Large Language Models In A Generative-critique Framework For Spear-phishing Email Generation (2024) | Qinglin Qi, Yun Luo, Yijia Xu, et al. | 6.34
- HarmBench: A Standardized Evaluation Framework For Automated Red Teaming And Robust Refusal (2024) | Mantas Mazeika, Long Phan, Xuwang Yin, et al. | 5.95
- SecAlign: Defending Against Prompt Injection With Preference Optimization (2024) | Sizhe Chen, Arman Zharmagambetov, Saeed Mahloujifar, et al. | 5.84
- UOR: Universal Backdoor Attacks On Pre-trained Language Models (2023) | Wei Du, Peixuan Li, Boqun Li, et al. | 5.84
- Community Targeted Phishing: A Middle Ground Between Massive And Spear Phishing Through Natural Language Generation (2017) | Alberto Giaretta, Nicola Dragoni | 5.84
- Automatically Generating Rules Of Malicious Software Packages Via Large Language Model (2025) | Xiangrui Zhang, Haoyu Chen, Yongzhong He, et al. | 5.82