Instruction Following, Synthetic Data, Commercial OK

OpenR1-Math-220k

by open-r1

Silver 56
14.2K downloads
719 likes

Description

OpenR1-Math-220k is a large-scale dataset for mathematical reasoning. It contains 220k math problems from NuminaMath 1.5, each with two to four reasoning traces generated by DeepSeek R1. The traces were verified with Math Verify for most samples and with Llama-3.3-70B-Instruct as a judge for 12% of the samples, and each problem has at least one reasoning trace with a correct answer. The dataset consists of two splits:… See the full description on the dataset page: https://huggingface.co/datasets/open-r1/OpenR1-Math-220k.
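Because each problem carries several traces and only some of them are verified as correct, a common first step is to keep just the verified ones. The following is a minimal sketch of that filter, not the dataset authors' code. The field names `generations`, `correctness_math_verify`, and `correctness_llama` are assumptions about the record schema, and the toy record is invented for illustration; check the dataset page for the actual columns. The `load_default_config` helper is also only a sketch: it assumes the `datasets` library is installed and that network access is available.

```python
from typing import Any


def load_default_config():
    """Sketch: fetch the dataset from the Hugging Face Hub (needs network access)."""
    from datasets import load_dataset  # pip install datasets

    return load_dataset("open-r1/OpenR1-Math-220k")


def _flags(record: dict[str, Any], key: str, n: int) -> list[bool]:
    """Read a per-trace list of booleans, padding missing entries with False."""
    values = record.get(key) or []
    return [bool(v) for v in values] + [False] * (n - len(values))


def correct_traces(record: dict[str, Any]) -> list[str]:
    """Return the traces that either verifier marked as correct.

    Assumed (hypothetical) schema: `generations` is a list of trace strings,
    and the two `correctness_*` fields are parallel lists of booleans.
    """
    traces = record["generations"]
    by_math_verify = _flags(record, "correctness_math_verify", len(traces))
    by_llama_judge = _flags(record, "correctness_llama", len(traces))
    return [
        trace
        for trace, mv, lj in zip(traces, by_math_verify, by_llama_judge)
        if mv or lj
    ]


# Toy record, invented for illustration only (not taken from the dataset).
toy_record = {
    "problem": "What is 2 + 3?",
    "generations": ["trace A: answer 5", "trace B: answer 6", "trace C: answer 5"],
    "correctness_math_verify": [True, False, False],
    "correctness_llama": [False, False, True],
}

kept = correct_traces(toy_record)
```

On the toy record, `kept` holds traces A and C. With the real data, the same function could be applied per row after calling `load_default_config()`, provided the column names match the assumptions above.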

Tags

language:en, license:apache-2.0, size_categories:100K<n<1M, format:parquet, modality:text, library:datasets, library:dask, library:polars, library:mlcroissant, region:us