Instruction Following | SFT, Synthetic Data | Copyleft

databricks-dolly-15k

by databricks

Silver 58
30.6K downloads
939 likes
10K<n<100K

Description

Summary: databricks-dolly-15k is an open-source dataset of instruction-following records generated by thousands of Databricks employees in several of the behavioral categories outlined in the InstructGPT paper, including brainstorming, classification, closed QA, generation, information extraction, open QA, and summarization. This dataset can be used for any purpose, whether academic or commercial, under the terms of the Creative Commons Attribution-ShareAlike 3.0 Unported… See the full description on the dataset page: https://huggingface.co/datasets/databricks/databricks-dolly-15k.
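As a rough illustration of how records like these might be handled, the sketch below parses JSON Lines text, counts records per category, and renders each record as a plain-text prompt. It assumes each record carries `instruction`, `context`, `response`, and `category` fields and that categories use snake_case labels such as `closed_qa`; check the dataset page to confirm both. The sample records, the helper names `load_records` and `to_prompt`, and the prompt layout are hypothetical and written only for this example.

```python
import json
from collections import Counter

# Hypothetical sample records written for illustration; NOT taken from the dataset.
# The field names and category labels are assumptions about the record layout.
SAMPLE_JSONL = """\
{"instruction": "List three uses for a paperclip.", "context": "", "response": "Bookmark, zipper pull, SIM tray tool.", "category": "brainstorming"}
{"instruction": "What colour is the sky in the passage?", "context": "The sky was grey all morning.", "response": "Grey.", "category": "closed_qa"}
{"instruction": "Summarize the passage in one sentence.", "context": "The meeting ran long. No decision was made.", "response": "The long meeting ended without a decision.", "category": "summarization"}
"""


def load_records(text):
    """Parse JSON Lines text into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def to_prompt(record):
    """Render one record as a prompt; the Context line is omitted when empty."""
    parts = ["Instruction: " + record["instruction"]]
    if record["context"]:
        parts.append("Context: " + record["context"])
    parts.append("Response: " + record["response"])
    return "\n".join(parts)


records = load_records(SAMPLE_JSONL)

# Count how many records fall into each behavioral category.
category_counts = Counter(r["category"] for r in records)

# Build one training-style prompt per record.
prompts = [to_prompt(r) for r in records]

print(category_counts)
print(prompts[1])
```

The same two helpers could be pointed at the real JSON file once downloaded; only the source of the text changes, not the parsing or formatting logic.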


Tags

task_categories:question-answering
task_categories:summarization
language:en
license:cc-by-sa-3.0
size_categories:10K<n<100K
format:json
modality:text
library:datasets
library:pandas
library:mlcroissant
library:polars
arxiv:2203.02155
region:us