DAPO
Emerging
2 papers using it
First seen: 2026
DAPO++ is a newly curated dataset for Reinforcement Learning from Verifiable Rewards (RLVR). It is designed to evaluate dataset quality and performance by providing a decontaminated training set with concentrated learning signals.