DeepSeek-R-1

Emerging
7 papers using it
First seen: 2025

The DeepSeek-R-1 dataset/benchmark is used to evaluate the efficiency and effectiveness of attention mechanisms in large language models, including Multi-head Latent Attention (MLA) and Grouped-Query Attention (GQA).

Papers using DeepSeek-R-1 (7)
