LongBench
Emerging
31 papers using it
69,173 HF downloads
181 HF likes
First seen: 2024
LongBench is a multilingual, multi-task benchmark designed to measure how well pre-trained language models understand long text. The dataset consists of twenty tasks covering key long-text application scenarios such as multi-document QA, single…
Papers using LongBench (29)
- IndexMem: Learned KV-Cache Eviction with Latent Memory for Long-Context LLM Inference
- AttentionRAG: Attention-Guided Context Pruning in Retrieval-Augmented Generation
- Activation-aware Probe-Query: Effective Key-Value Retrieval for Long-Context LLMs Inference
- ProxyKV: Cross-Model Proxy Pruning for Efficient Long-Context LLM Inference
- ART: Attention Run-time Termination for Efficient Large Language Model Decoding
- IceCache: Memory-efficient KV-cache Management for Long-Sequence LLMs
- LycheeDecode: Accelerating Long-Context LLM Inference via Hybrid-Head Sparse Decoding
- Reinforced Fast Weights with Next-Sequence Prediction
- Federation of Experts: Communication Efficient Distributed Inference for Large Language Models
- EndPrompt: Efficient Long-Context Extension via Terminal Anchoring
- M-RAG: Making RAG Faster, Stronger, and More Efficient
- Developing Adaptive Context Compression Techniques for Large Language Models (LLMs) in Long-Running Interactions
- AllMem: A Memory-centric Recipe for Efficient Long-context Modeling
- Towards robust long-context understanding of large language model via active recap learning
- PagedEviction: Structured Block-wise KV Cache Pruning for Efficient Large Language Model Inference
- DSPC: Dual-Stage Progressive Compression Framework for Efficient Long-Context Reasoning
- ChunkKV: Semantic-Preserving KV Cache Compression for Efficient Long-Context LLM Inference
- MDocAgent: A Multi-Modal Multi-Agent Framework for Document Understanding
- Overflow Prevention Enhances Long-Context Recurrent LLMs
- Beyond Homogeneous Attention: Memory-Efficient LLMs via Fourier-Approximated KV Cache
- Loong: Synthesize Long Chain-of-Thoughts at Scale through Verifiers
- MacRAG: Compress, Slice, and Scale-up for Multi-Scale Adaptive Context RAG
- MiniLongBench: The Low-cost Long Context Understanding Benchmark for Large Language Models
- An Empirical Study on Prompt Compression for Large Language Models
- PromptDistill: Query-based Selective Token Retention in Intermediate Layers for Efficient Large Language Model Inference
- CriticalKV: Optimizing KV Cache Eviction from an Output Perturbation Perspective
- Does RAG Really Perform Bad For Long-Context Processing?
- Task-agnostic Prompt Compression with Context-aware Sentence Embedding and Reward-guided Task Descriptor
- Extending Context Window of Large Language Models from a Distributional Perspective