
Thinking Longer, Not Larger: Enhancing Software Engineering Agents Via Scaling Test-time Compute

2025

Abstract

Recent advancements in software engineering agents have demonstrated promising capabilities in automating program improvements. However, their reliance on closed-source or resource-intensive models introduces significant deployment challenges in private environments, prompting a critical question: How can personally deployable open-source LLMs achieve comparable code reasoning performance? To this end, we propose a unified Test-Time Compute scaling framework that leverages increased inference-time computation instead of larger models. Our framework incorporates two complementary strategies: internal TTC and external TTC. Internally, we introduce a development-contextualized trajectory synthesis method leveraging real-world software repositories to bootstrap multi-stage reasoning processes, such as fault localization and patch generation. We further enhance trajectory quality through rejection sampling, rigorously evaluating trajectories along accuracy and comple
