AMC
Emerging
7 papers using it
15 HF downloads
0 HF likes
First seen: 2025
Papers using AMC (7)
- Distribution-Aware Reward Estimation for Test-Time Reinforcement Learning
- Transformation-Augmented GRPO for Enhancing Exploration in Reasoning of Large Language Models
- Efficient Reinforcement Finetuning via Adaptive Curriculum Learning
- SEED-GRPO: Semantic Entropy Enhanced GRPO for Uncertainty-Aware Policy Optimization
- LIMOPro: Reasoning Refinement for Efficient and Effective Test-time Scaling
- Inpainting-Guided Policy Optimization for Diffusion Large Language Models
- Can LLMs Guide Their Own Exploration? Gradient-Guided Reinforcement Learning for LLM Reasoning