Instruction FollowingSFTUnknown
COIG-CQIA
by m-a-p
8.8Kdownloads
706likes
10K<n<100KDescription
COIG-CQIA:Quality is All you need for Chinese Instruction Fine-tuning
Dataset Details
Dataset Description
欢迎来到COIG-CQIA,COIG-CQIA全称为Chinese Open Instruction Generalist - Quality is All You Need, 是一个开源的高质量指令微调数据集,旨在为中文NLP社区提供高质量且符合人类交互行为的指令微调数据。COIG-CQIA以中文互联网获取到的问答及文章作为原始数据,经过深度清洗、重构及人工审核构建而成。本项目受LIMA: Less Is More for Alignment等研究启发,使用少量高质量的数据即可让大语言模型学习到人类交互行为,因此在数据构建中我们十分注重数据的来源、质量与多样性,数据集详情请见数据介绍以及我们接下来的论文。
Welcome to the COIG-CQIA… See the full description on the dataset page: https://huggingface.co/datasets/m-a-p/COIG-CQIA.
What can I do with this?
Tags
task_categories:question-answeringtask_categories:text-classificationtask_categories:text-generationlanguage:zhsize_categories:10K<n<100Kformat:jsonmodality:textlibrary:datasetslibrary:dasklibrary:mlcroissantlibrary:polarsarxiv:2403.18058arxiv:2304.07987arxiv:2307.09705region:us