Defects4J

Canonical

84 papers using it

2018 first seen

Defects4J is a benchmark of reproducible real-world bugs from open-source Java projects, used to evaluate fault localization, automated program repair, and test generation techniques.

Papers using Defects4J (83)

Are Large Language Models Memorizing Bug Benchmarks? · 2024 · 4 cites

Multi-task LLMs for Bug Classification: Efficient Inference with Auxiliary Decoding Heads · 2026

LLM-based Mockless Unit Test Generation for Java · 2026

The Impact of Fine-tuning Large Language Models on Automated Program Repair · 2025 · 1 cite

HyperAgent: Generalist Software Engineering Agents to Solve Coding Tasks at Scale · 2024 · 1 cite

Runtime Execution Traces Guided Automated Program Repair with Multi-Agent Debate · 2026

Project Prometheus: Bridging the Intent Gap in Agentic Program Repair via Reverse-Engineered Executable Specifications · 2026

DebugRepair: Enhancing LLM-Based Automated Program Repair via Self-Directed Debugging · 2026

Agentic Code Reasoning · 2026

Boosting LLMs for Mutation Generation · 2026

Specification Vibing for Automated Program Repair · 2026

HAFixAgent: History-Aware Program Repair Agent · 2025

Enhancing LLM-based Fault Localization with a Functionality-Aware Retrieval-Augmented Generation Framework · 2025

BloomAPR: A Bloom's Taxonomy-based Framework for Assessing the Capabilities of LLM-Powered APR Solutions · 2025

Reinforcement Learning-Guided Chain-of-Draft for Token-Efficient Code Generation · 2025

Breaking the Myth: Can Small Models Infer Postconditions Too? · 2025

Improving LLM-Based Fault Localization with External Memory and Project Context · 2025

HAFixAgent: History-Aware Automated Program Repair Agent · 2025

The Art of Repair: Optimizing Iterative Program Repair with Instruction-Tuned Models · 2025

Assessing the Impact of Code Changes on the Fault Localizability of Large Language Models · 2025

Evaluating the Generalizability of LLMs in Automated Program Repair · 2025

LLMs are Bug Replicators: An Empirical Study on LLMs' Capability in Completing Bug-prone Code · 2025

Studying and Understanding the Effectiveness and Failures of Conversational LLM-Based Repair · 2025

Where's the Bug? Attention Probing for Scalable Fault Localization · 2025

HAFix: History-Augmented Large Language Models for Bug Fixing · 2025

FlexFL: Flexible and Effective Fault Localization with Open-Source Large Language Models · 2024

Exploring and Lifting the Robustness of LLM-powered Automated Program Repair with Metamorphic Testing · 2024

ContrastRepair: Enhancing Conversation-Based Automated Program Repair via Contrastive Test Case Pairs · 2024

Boosting Redundancy-based Automated Program Repair by Fine-grained Pattern Mining · 2023

SequenceR: Sequence-to-Sequence Learning for End-to-End Program Repair · 2019 · 392 cites

TBar: Revisiting Template-based Automated Program Repair · 2019 · 287 cites

CURE: Code-Aware Neural Machine Translation for Automatic Program Repair · 2021 · 263 cites

Less Training, More Repairing Please: Revisiting Automated Program Repair via Zero-shot Learning · 2022 · 187 cites

iFixR: Bug Report driven Program Repair · 2019 · 83 cites

SelfAPR: Self-supervised Program Repair with Test Execution Diagnostics · 2022 · 75 cites

A Deep Dive into Large Language Models for Automated Bug Localization and Repair · 2024 · 57 cites

Alleviating Patch Overfitting with Automatic Test Generation: A Study of Feasibility and Effectiveness for the Nopol Repair System · 2018 · 54 cites

Extracting Concise Bug-Fixing Patches from Human-Written Patches in Version Control Systems · 2021 · 38 cites

ThinkRepair: Self-Directed Automated Program Repair · 2024 · 35 cites

How Different Is It Between Machine-Generated and Developer-Provided Patches? An Empirical Study on The Correct Patches Generated by Automated Program Repair Techniques · 2019 · 32 cites

Large Language Models in Fault Localisation · 2023 · 19 cites

RepairAgent: An Autonomous, LLM-Based Agent for Program Repair · 2024 · 14 cites

DEAR: A Novel Deep Learning-based Approach for Automated Program Repair · 2022 · 11 cites

ENCORE: Ensemble Learning using Convolution Neural Machine Translation for Automatic Program Repair · 2019 · 10 cites

Domain Adaptation for Code Model-based Unit Test Case Generation · 2023 · 9 cites

Unit Test Case Generation with Transformers and Focal Context · 2020 · 8 cites

CigaR: Cost-efficient Program Repair with LLMs · 2024 · 8 cites

Large Language Models are Few-shot Testers: Exploring LLM-based General Bug Reproduction · 2022 · 7 cites

Towards Generating Functionally Correct Code Edits from Natural Language Issue Descriptions · 2023 · 7 cites

APPT: Boosting Automated Patch Correctness Prediction via Fine-tuning Pre-trained Models · 2023 · 5 cites

Large Language Models for Test-Free Fault Localization · 2023 · 5 cites

UniDebugger: Hierarchical Multi-Agent Framework for Unified Software Debugging · 2024 · 5 cites

Revisiting ssFix for Better Program Repair · 2019 · 4 cites

MCRepair: Multi-Chunk Program Repair via Patch Optimization with Buggy Block · 2023 · 4 cites

Practical Program Repair via Bytecode Mutation · 2018 · 2 cites

Harnessing Evolution for Multi-Hunk Program Repair · 2019 · 2 cites

Can Automated Program Repair Refine Fault Localization? · 2019 · 2 cites

Can LLMs Demystify Bug Reports? · 2023 · 2 cites

Enriching Automatic Test Case Generation by Extracting Relevant Test Inputs from Bug Reports · 2023 · 2 cites

RESTORE: Retrospective Fault Localization Enhancing Automated Program Repair · 2019 · 1 cite

Better Automatic Program Repair by Using Bug Reports and Tests Together · 2020 · 1 cite

Adversarial Patch Generation for Automated Program Repair · 2020 · 1 cite

Evaluating Diverse Large Language Models for Automatic and General Bug Reproduction · 2023 · 1 cite

BugsInPy: A Database of Existing Bugs in Python Programs to Enable Controlled Testing and Debugging Studies · 2024 · 1 cite

RepairBench: Leaderboard of Frontier Models for Program Repair · 2024 · 1 cite

HyperAgent: Generalist Software Engineering Agents to Solve Coding Tasks at Scale · 2024 · 1 cite

Attention Please: Consider Mockito when Evaluating Newly Proposed Automated Program Repair Techniques · 2018

AVATAR: Fixing Semantic Bugs with Fix Patterns of Static Analysis Violations · 2018

Automated Classification of Overfitting Patches with Statically Extracted Code Features · 2019

Elixir: Effective object-oriented program repair · 2021

A Quick Repair Facility for Debugging · 2022

Reinforcement Learning for Mutation Operator Selection in Automated Program Repair · 2023

The GitHub Recent Bugs Dataset for Evaluating LLM-based Debugging Applications · 2023

GitBug-Actions: Building Reproducible Bug-Fix Benchmarks with GitHub Actions · 2023

Aligning the Objective of LLM-based Program Repair · 2024

On The Effectiveness of Dynamic Reduction Techniques in Automated Program Repair · 2024

Impact of Large Language Models of Code on Fault Localization · 2024

Revisiting Evolutionary Program Repair via Code Language Model · 2024

Memory-Efficient Large Language Models for Program Repair with Semantic-Guided Patch Generation · 2024

Software Fault Localization Based on Multi-objective Feature Fusion and Deep Learning · 2024

What You See Is What You Get: Attention-based Self-guided Automatic Unit Test Generation · 2024

Using Defect Prediction to Improve the Bug Detection Capability of Search-Based Software Testing · 2022

Neural-Based Test Oracle Generation: A Large-scale Evaluation and Lessons Learned · 2023