Math & Reasoning | ORPO, Pretraining | Non-Commercial
Natural Reasoning
by facebook
1.5K downloads
555 likes
1M<n<10M
Description
NaturalReasoning is a large-scale dataset for general reasoning tasks. It consists of high-quality, challenging reasoning questions backtranslated from the pretraining corpora DCLM and FineMath. The questions have been deduplicated and decontaminated against popular reasoning benchmarks, including MATH, GPQA, MMLU-Pro, and MMLU-STEM. For each question, we extract the reference final answer from the original pretraining document where possible. We also provide a model-generated response from… See the full description on the dataset page: https://huggingface.co/datasets/facebook/natural_reasoning.
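The description says the questions were deduplicated and decontaminated but does not say how. As a rough illustration of what those two steps can look like, here is a minimal sketch on toy data. Everything in it is an assumption made for illustration, not the dataset authors' pipeline: the helper names (`normalize`, `ngrams`, `deduplicate`, `decontaminate`), exact-match deduplication on normalized text, and the 5-gram overlap rule are all hypothetical choices.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and collapse every run of non-alphanumerics into one space."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def ngrams(text: str, n: int) -> set:
    """Return the set of word n-grams of the normalized text.

    Texts with fewer than n words yield an empty set, so they are never flagged.
    """
    tokens = normalize(text).split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def deduplicate(questions):
    """Keep the first occurrence of each question, comparing normalized text."""
    seen = set()
    unique = []
    for question in questions:
        key = normalize(question)
        if key not in seen:
            seen.add(key)
            unique.append(question)
    return unique


def decontaminate(questions, benchmark_questions, n=5):
    """Drop any question sharing at least one word n-gram with a benchmark item.

    The n-gram size of 5 is an illustrative assumption, not a documented setting.
    """
    banned = set()
    for item in benchmark_questions:
        banned |= ngrams(item, n)
    return [q for q in questions if not (ngrams(q, n) & banned)]


# Toy data, invented for this example.
questions = [
    "What is the derivative of x^2 with respect to x?",
    "what is the derivative of x^2 with respect to x",
    "Explain why the sum of two even integers is always even.",
    "How many prime numbers are there between 10 and 30?",
]
benchmark = ["How many prime numbers are there between 10 and 30"]

cleaned = decontaminate(deduplicate(questions), benchmark)
print(cleaned)
```

On the toy data, the second question is removed as a near-verbatim duplicate of the first, and the fourth is removed because it overlaps a benchmark item. Real pipelines often use fuzzier matching (for example embedding similarity), which this sketch does not attempt.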
Tags
task_categories:text-generation, language:en, license:cc-by-nc-4.0, size_categories:1M<n<10M, format:json, modality:text, library:datasets, library:pandas, library:mlcroissant, library:polars, arxiv:2502.13124, region:us
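Since the tags list `library:datasets`, the data can presumably be read with the Hugging Face `datasets` library. The sketch below streams a few records rather than downloading the whole 1M<n<10M set. The repository id comes from the dataset page URL; the split name `"train"` and the helper name `preview` are assumptions, and no field names are assumed because this page does not list them.

```python
from itertools import islice

# Repository id taken from the dataset page URL.
DATASET_ID = "facebook/natural_reasoning"


def preview(n=3, split="train"):
    """Return the first n records of the dataset as a list of dicts.

    The split name "train" is an assumption; check the dataset page for the
    actual split names. Requires network access and `pip install datasets`.
    """
    # Imported lazily so that defining this helper does not require the library.
    from datasets import load_dataset

    # streaming=True iterates over records without downloading the full dataset.
    stream = load_dataset(DATASET_ID, split=split, streaming=True)
    return list(islice(stream, n))
```

Calling `preview()` and printing each returned record is a quick way to see which fields are present before writing any processing code. Note that the license tag is `cc-by-nc-4.0`, so the data is restricted to non-commercial use.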