curated dataset of 4,070 packages

Emerging

1 paper using it

2026 first seen

This curated dataset contains 4,070 software packages: 3,700 benign and 370 malicious. It is used to evaluate how well Large Language Models (LLMs) detect malicious packages and identify specific malicious indicators.

Papers using curated dataset of 4,070 packages (1)

Mind the Gap: Evaluating LLMs for High-Level Malicious Package Detection vs. Fine-Grained Indicator Identification (2026)