ChemCoTBench-V-2
Emerging · 1 paper using it
First seen: 2026
ChemCoTBench-V-2 is a rule-verifiable diagnostic benchmark for assessing structured chemical reasoning in large language models. It contains 5,620 evaluation samples across 18 tasks and requires models to provide and verify intermediate reasoning steps.