Instruction Following | PPO, Reward Modeling | Commercial OK
HelpSteer2
by nvidia
3.5K downloads
442 likes
10K<n<100K
Description
HelpSteer2: Open-source dataset for training top-performing reward models
HelpSteer2 is an open-source Helpfulness Dataset (CC-BY-4.0) that supports aligning models to become more helpful, factually correct and coherent, while being adjustable in terms of the complexity and verbosity of its responses.
This dataset has been created in partnership with Scale AI.
When used to tune a Llama 3.1 70B Instruct Model, we achieve 94.1% on RewardBench, which makes it the best Reward Model as… See the full description on the dataset page: https://huggingface.co/datasets/nvidia/HelpSteer2.
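One common way to prepare attribute-scored data like this for reward modeling is to turn responses to the same prompt into chosen/rejected pairs. The following is a minimal sketch of that idea, not the authors' training pipeline. The field names (`prompt`, `response`, `helpfulness`) are assumed from the attributes named in the description, the sample rows are invented for illustration, and `build_preference_pairs` is a hypothetical helper.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical rows, invented for illustration (not taken from the dataset).
# Field names are assumptions based on the attributes the description lists.
SAMPLE_ROWS = [
    {"prompt": "Explain recursion.", "response": "A function that calls itself on a smaller input.", "helpfulness": 4},
    {"prompt": "Explain recursion.", "response": "It is a thing in code.", "helpfulness": 1},
    {"prompt": "Name a prime number.", "response": "7", "helpfulness": 3},
    {"prompt": "Name a prime number.", "response": "11", "helpfulness": 3},
]


def build_preference_pairs(rows, key="helpfulness"):
    """Group rows by prompt and emit (chosen, rejected) pairs ranked by `key`."""
    by_prompt = defaultdict(list)
    for row in rows:
        by_prompt[row["prompt"]].append(row)

    pairs = []
    for prompt, group in by_prompt.items():
        for a, b in combinations(group, 2):
            if a[key] == b[key]:
                continue  # a tie carries no preference signal, so skip it
            chosen, rejected = (a, b) if a[key] > b[key] else (b, a)
            pairs.append(
                {
                    "prompt": prompt,
                    "chosen": chosen["response"],
                    "rejected": rejected["response"],
                    "margin": chosen[key] - rejected[key],
                }
            )
    return pairs


pairs = build_preference_pairs(SAMPLE_ROWS)
for pair in pairs:
    print(pair["prompt"], "| margin:", pair["margin"])
```

Real rows could presumably be loaded with the Hugging Face `datasets` library listed in the tags, for example `load_dataset("nvidia/HelpSteer2")`, and passed to the same helper once the actual column names are confirmed on the dataset page.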
Tags
language:en, license:cc-by-4.0, size_categories:10K<n<100K, format:json, modality:tabular, modality:text, library:datasets, library:pandas, library:mlcroissant, library:polars, arxiv:2410.01257, arxiv:2406.08673, region:us, human-feedback