Code, Synthetic Data, Unknown
OpenThoughts-1k-sample
by ryanmarten
425.9K downloads
2 likes
Description
Note: We have released a paper for OpenThoughts! See our paper (arXiv:2506.04178).
Open-Thoughts-1k-sample
This is a 1k sample of the OpenThoughts-114k dataset.
Open synthetic reasoning dataset with high-quality examples covering math, science, code, and puzzles!
Inspect the content with rich formatting in the Curator Viewer.
Available Subsets
default subset containing ready-to-train data used to finetune the OpenThinker-7B and OpenThinker-32B models:
ds =…
See the full description on the dataset page: https://huggingface.co/datasets/ryanmarten/OpenThoughts-1k-sample.
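The loading snippet on this page is truncated. As a minimal sketch, assuming the Hugging Face `datasets` library is installed and that the `default` subset exposes a `train` split (an assumption, not confirmed by the text here), loading the sample could look like the following. The helper name `load_sample` is hypothetical and used only for illustration.

```python
# Minimal sketch: load the default subset with the Hugging Face `datasets` library.
# Assumptions (not confirmed by the card text): the split is named "train".
# `load_sample` is a hypothetical helper name, not part of any library.

REPO_ID = "ryanmarten/OpenThoughts-1k-sample"  # repository id taken from the dataset page URL
SUBSET = "default"                             # subset name listed under "Available Subsets"


def load_sample(split="train"):
    """Download (or reuse the local cache of) the requested split and return it."""
    # Imported lazily so this file can be imported even without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, SUBSET, split=split)
```

Calling `ds = load_sample()` needs network access on first use; printing `ds` afterwards lists the column names and row count, which is a quick way to check the actual schema before training.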
Tags
size_categories:1K<n<10K, format:parquet, modality:text, library:datasets, library:pandas, library:mlcroissant, library:polars, arxiv:2506.04178, region:us