BigGen Bench

Name: BigGen Bench
License: cc-by-sa-4.0

Emerging

Papers using it: 3
HF downloads: 154
HF likes: 17
First seen: 2025

BIGGEN-Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models

Dataset Description

BIGGEN-Bench (BiG Generation Benchmark) is a comprehensive evaluation benchmark designed to assess the capabilities of large language models (LLMs) across a wide range of tasks. This benchmark focuses on free-form te

Source: Hugging Face

Papers using BigGen Bench (3)

An Empirical Study of LLM-as-a-Judge: How Design Choices Impact Evaluation Reliability (2025, 2 citations)

Bridging Human and LLM Judgments: Understanding and Narrowing the Gap (2025)

INFERENCEDYNAMICS: Efficient Routing Across LLMs through Structured Capability and Knowledge Profiling (2025)