Code, Commercial OK

FLAN

by Open-Orca

Silver 54
24.8K downloads
188 likes
1B<n<10B

Description

๐Ÿฎ The WHOLE FLAN Collection! ๐Ÿฎ Overview This repository includes the full dataset from the FLAN Collection, totalling ~300GB as parquets. Generated using the official seqio templating from the Google FLAN Collection GitHub repo. The data is subject to all the same licensing of the component datasets. To keep up with our continued work on OpenOrca and other exciting research, find our Discord here: https://AlignmentLab.ai Motivation This work was done as part ofโ€ฆ See the full description on the dataset page: https://huggingface.co/datasets/Open-Orca/FLAN.


Tags

language:en, license:cc-by-4.0, size_categories:100M<n<1B, format:parquet, modality:text, library:datasets, library:dask, library:mlcroissant, library:polars, arxiv:2301.13688, arxiv:2109.01652, arxiv:2110.08207, arxiv:2204.07705, region:us