Math & Reasoning | Commercial OK

NuminaMath CoT

by AI-MO

Silver 57
30.5K downloads
552 likes

Description

Approximately 860k math problems, where each solution is formatted in a Chain of Thought (CoT) manner. The sources of the dataset range from Chinese high school math exercises to US and international mathematics olympiad competition problems. The data were primarily collected from online exam paper PDFs and mathematics discussion forums. The processing steps include (a) OCR from the original PDFs, (b) segmentation into… See the full description on the dataset page: https://huggingface.co/datasets/AI-MO/NuminaMath-CoT.

What can I do with this?
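
Since the tags list text-generation as the task and the Hugging Face `datasets` library as a supported loader, one plausible use is turning each problem and its CoT solution into a prompt/completion pair. The sketch below is a minimal illustration, not the dataset authors' pipeline: the column names `problem` and `solution`, the `train` split, and the prompt template are assumptions that should be checked against the dataset page, and the helper names are hypothetical.

```python
from itertools import islice

# Repository id taken from the dataset page URL in the description.
DATASET_ID = "AI-MO/NuminaMath-CoT"


def to_prompt_completion(record, problem_key="problem", solution_key="solution"):
    """Turn one record into a prompt/completion pair for text generation.

    The default column names are assumptions; pass other keys if the
    actual schema differs.
    """
    problem = record[problem_key].strip()
    solution = record[solution_key].strip()
    # Hypothetical prompt template; adjust to the model being trained.
    prompt = f"Problem:\n{problem}\n\nSolution (step by step):\n"
    return {"prompt": prompt, "completion": solution}


def preview(n=3, split="train"):
    """Stream the first n records and format them.

    Needs the `datasets` package and network access. The split name
    is an assumption.
    """
    from datasets import load_dataset  # imported lazily; only needed here

    # streaming=True avoids downloading all ~860k rows up front.
    stream = load_dataset(DATASET_ID, split=split, streaming=True)
    return [to_prompt_completion(row) for row in islice(stream, n)]
```

Calling `preview(3)` would fetch three formatted examples, provided the assumed schema holds. The formatting helper works on any dict-like record, so it can be tried offline on a hand-written example first.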

Tags

task_categories:text-generation, language:en, license:apache-2.0, size_categories:100K<n<1M, format:parquet, modality:text, library:datasets, library:dask, library:polars, library:mlcroissant, region:us, aimo, math