
Chatbot Arena

Canonical

8 papers using it

First seen: 2024


Papers using Chatbot Arena (8)

SCOPE: Selective Conformal Optimized Pairwise LLM Judging (2026)

Dropping Just a Handful of Preferences Can Change Top Large Language Model Rankings (2025)

Bridging Human and LLM Judgments: Understanding and Narrowing the Gap (2025)

SynthesizeMe! Inducing Persona-Guided Prompts for Personalized Reward Models in LLMs (2025)

SynthesizeMe! Inducing Persona-Guided Prompts for Personalized Reward Models in LLMs (2025)

Decentralized Arena: Towards Democratic and Scalable Automatic Evaluation of Language Models (2025)

Investigating Non-Transitivity in LLM-as-a-Judge (2025)

A Statistical Framework for Ranking LLM-Based Chatbots (2024)
