Instruction Following | SFT, Synthetic Data | Copyleft
databricks-dolly-15k
by databricks
30.6K downloads
939 likes
10K<n<100K
Description
Summary
databricks-dolly-15k is an open source dataset of instruction-following records generated by thousands of Databricks employees in several
of the behavioral categories outlined in the InstructGPT paper, including brainstorming, classification,
closed QA, generation, information extraction, open QA, and summarization.
This dataset can be used for any purpose, whether academic or commercial, under the terms of the
Creative Commons Attribution-ShareAlike 3.0 Unported… See the full description on the dataset page: https://huggingface.co/datasets/databricks/databricks-dolly-15k.
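Since the records are grouped by behavioral category, a first step when working with the data is often to count or filter records per category. The sketch below illustrates this on a few hypothetical in-memory records. The field names (`instruction`, `context`, `response`, `category`) and the sample rows are assumptions made for illustration and are not stated in the description above; check the dataset page for the actual schema.

```python
from collections import Counter

# Hypothetical sample records. Field names and category labels are assumptions
# for illustration; they are not taken from the description above.
SAMPLE_RECORDS = [
    {"instruction": "List three uses for a paperclip.", "context": "",
     "response": "Holding paper, resetting a router, marking a page.",
     "category": "brainstorming"},
    {"instruction": "Is a tomato a fruit or a vegetable?", "context": "",
     "response": "Botanically, a fruit.",
     "category": "classification"},
    {"instruction": "Summarize the passage.",
     "context": "Dolly is a dataset of instruction-following records.",
     "response": "Dolly contains instruction-following records.",
     "category": "summarization"},
    {"instruction": "Name three colors.", "context": "",
     "response": "Red, green, blue.",
     "category": "brainstorming"},
]


def count_by_category(records):
    """Return a Counter mapping each category label to its number of records."""
    return Counter(record["category"] for record in records)


def filter_category(records, category):
    """Return only the records whose category equals the given label."""
    return [record for record in records if record["category"] == category]


if __name__ == "__main__":
    # Print the per-category counts, most frequent first.
    for label, count in count_by_category(SAMPLE_RECORDS).most_common():
        print(label, count)
```

With the Hugging Face `datasets` library installed and network access available, the same helpers could presumably be applied to rows obtained from `load_dataset("databricks/databricks-dolly-15k", split="train")`, using the repository id from the dataset page URL above.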
Tags
task_categories:question-answering, task_categories:summarization, language:en, license:cc-by-sa-3.0, size_categories:10K<n<100K, format:json, modality:text, library:datasets, library:pandas, library:mlcroissant, library:polars, arxiv:2203.02155, region:us