Preference & Alignment (DPO/RLHF)Preference LearningOther

chatbot_arena_conversations

by lmsys

Silver53
7.9Kdownloads
449likes
10K<n<100K

Description

Chatbot Arena Conversations Dataset This dataset contains 33K cleaned conversations with pairwise human preferences. It is collected from 13K unique IP addresses on the Chatbot Arena from April to June 2023. Each sample includes a question ID, two model names, their full conversation text in OpenAI API JSON format, the user vote, the anonymized user ID, the detected language tag, the OpenAI moderation API tag, the additional toxic tag, and the timestamp. To ensure the safe release… See the full description on the dataset page: https://huggingface.co/datasets/lmsys/chatbot_arena_conversations.

What can I do with this?

Tags

license:ccsize_categories:10K<n<100Kformat:parquetmodality:tabularmodality:textlibrary:datasetslibrary:pandaslibrary:mlcroissantlibrary:polarsarxiv:2306.05685region:us