Minerva
Emerging
6 papers using it
2025 first seen
Papers using Minerva (6)
- Transformation-Augmented GRPO for Enhancing Exploration in Reasoning of Large Language Models
- Long Chain-of-Thought Compression via Fine-Grained Group Policy Optimization
- Prompting Test-Time Scaling Is A Strong LLM Reasoning Data Augmentation
- SEED-GRPO: Semantic Entropy Enhanced GRPO for Uncertainty-Aware Policy Optimization
- Socratic-Zero : Bootstrapping Reasoning via Data-Free Agent Co-evolution
- Walk Before You Run! Concise LLM Reasoning via Reinforcement Learning