The Stack v-2

Emerging

4 papers using it

2024 first seen

Papers using The Stack v-2 (4)

Cracks in The Stack: Hidden Vulnerabilities and Licensing Risks in LLM Pre-Training Datasets · 2025 · 4 cites

When to Ponder: Adaptive Compute Allocation for Code Generation via Test-Time Training · 2026

StarCoder 2 and The Stack v2: The Next Generation · 2024 · 59 cites

Enhancing Cross-Language Code Translation via Task-Specific Embedding Alignment in Retrieval-Augmented Generation · 2024