DebugBench

Emerging

3 papers using it

First seen: 2024

Dataset Summary

DebugBench is a Large Language Model (LLM) debugging benchmark introduced in the paper "DebugBench: Evaluating Debugging Capability of Large Language Models". We collect code snippets from the LeetCode community and implant bugs into source data with GPT-4. The project is also open-sourced as a GitHub rep

Papers using DebugBench (3)

MdEval: Massively Multilingual Code Debugging · 2024 · 1 cites

DebugBench: Evaluating Debugging Capability of Large Language Models · 2024 · 34 cites

Debugging with Open-Source Large Language Models: An Evaluation · 2024 · 8 cites