← all datasets

ChemCoTBench-V-2

Emerging
1 paper using it
2026 first seen

ChemCoTBench-V-2 is a rule-verifiable diagnostic benchmark containing 5,620 evaluation samples across 18 tasks. It assesses structured chemical reasoning in large language models by requiring them to provide and verify intermediate reasoning steps.

Papers using ChemCoTBench-V-2 (1)
