HELMET

Emerging

3 papers using it

828 HF downloads

9 HF likes

2025 first seen

HELMET: How to Evaluate Long-context Language Models Effectively and Thoroughly

HELMET is a comprehensive benchmark for long-context language models covering seven diverse categories of tasks. The datasets are application-centric and are designed to evaluate models at different lengths and levels of compl

🤗 Hugging Face

Papers using HELMET (3)

RecaLLM: Addressing the Lost-in-Thought Phenomenon with Explicit In-Context Retrieval (2026)

LongMagpie: A Self-synthesis Method for Generating Large-scale Long-context Instructions (2025)

NExtLong: Toward Effective Long-Context Training without Long Documents (2025)