Qwen-3-1.7B

Emerging

3 papers using it

First seen: 2025

'Qwen-3-1.7B' is a benchmark used to evaluate models on reasoning tasks, specifically how well the ADPO method improves their response quality and robustness.

Papers using Qwen-3-1.7B (3)

ARCA: Adapter-Residual Credit Assignment When Token Signals Degenerate (2026)

ADPO: Anchored Direct Preference Optimization (2025)

RePO: Replay-Enhanced Policy Optimization (2025)