Math & ReasoningCommercial OK
NuminaMath CoT
by AI-MO
30.5Kdownloads
552likes
Description
Dataset Card for NuminaMath CoT
Dataset Summary
Approximately 860k math problems, where each solution is formatted in a Chain of Thought (CoT) manner. The sources of the dataset range from Chinese high school math exercises to US and international mathematics olympiad competition problems. The data were primarily collected from online exam paper PDFs and mathematics discussion forums. The processing steps include (a) OCR from the original PDFs, (b) segmentation into… See the full description on the dataset page: https://huggingface.co/datasets/AI-MO/NuminaMath-CoT.
What can I do with this?
Tags
task_categories:text-generationlanguage:enlicense:apache-2.0size_categories:100K<n<1Mformat:parquetmodality:textlibrary:datasetslibrary:dasklibrary:polarslibrary:mlcroissantregion:usaimomath