LiveCodeBench
Emerging, 19 papers using it
First seen: 2025
Papers using LiveCodeBench (19)
- Kimi k1.5: Scaling Reinforcement Learning with LLMs
- Thinking-Free Policy Initialization Makes Distilled Reasoning Models More Effective and Efficient Reasoners
- Ring-lite: Scalable Reasoning via C3PO-Stabilized Reinforcement Learning for LLMs
- Apriel-1.5-OpenReasoner: RL Post-Training for General-Purpose and Efficient Reasoning
- ReflexiCoder: Teaching Large Language Models to Self-Reflect on Generated Code and Self-Correct It via Reinforcement Learning
- SAGE: Multi-Agent Self-Evolution for LLM Reasoning
- LLMs Can Learn to Reason Via Off-Policy RL
- ReVeal: Self-Evolving Code Agents via Reliable Self-Verification
- AceReason-Nemotron: Advancing Math and Code Reasoning through Reinforcement Learning
- SRPO: A Cross-Domain Implementation of Large-Scale Reinforcement Learning on LLM
- ACECODER: Acing Coder RL via Automated Test-Case Synthesis
- Process Reward Models That Think
- Skywork Open Reasoner 1 Technical Report
- Learning to Orchestrate Agents in Natural Language with the Conductor
- CLEANER: Self-Purified Trajectories Boost Agentic Reinforcement Learning
- FunPRM: Function-as-Step Process Reward Model with Meta Reward Correction for Code Generation
- Bridging Online and Offline RL: Contextual Bandit Learning for Multi-Turn Code Generation
- Agnostics: Learning to Code in Any Programming Language via Reinforcement with a Universal Learning Environment
- Sample More to Think Less: Group Filtered Policy Optimization for Concise Reasoning