ClassEval
Canonical · 12 papers using it
First seen: 2023
ClassEval is a manually crafted benchmark for evaluating large language models on class-level code generation. Papers using it also cover related tasks, including class-level code translation and benchmarking on a low-resource general-purpose programming language.
Papers using ClassEval (12)
- ClassEval-T: Evaluating Large Language Models in Class-Level Code Translation
- CangjieBench: Benchmarking LLMs on a Low-Resource General-Purpose Programming Language
- Scaling Test-Driven Code Generation from Functions to Classes: An Empirical Study
- Automated Test Suite Enhancement Using Large Language Models with Few-shot Prompting
- From Human to Machine Refactoring: Assessing GPT-4's Impact on Python Class Quality and Readability
- Evaluating Software Process Models for Multi-Agent Class-Level Code Generation
- TALM: Dynamic Tree-Structured Multi-Agent Framework with Long-Term Memory for Scalable Code Generation
- ClassEval: A Manually-Crafted Benchmark for Evaluating LLMs on Class-level Code Generation
- Reasoning Runtime Behavior of a Program with LLM: How Far Are We?
- CoCoST: Automatic Complex Code Generation with Online Searching and Correctness Testing
- TaskEval: Assessing Difficulty of Code Generation Tasks for Large Language Models
- Strategic Optimization and Challenges of Large Language Models in Object-Oriented Programming