vision-arena-bench-v0.1
by lmarena-ai
1.7K downloads
3 likes
100<n<1K
Description
VisionArena-Bench: An automatic eval pipeline to estimate model preference rankings
A benchmark of 500 diverse user prompts for cheaply approximating Chatbot Arena model rankings, scored automatically with a VLM as the judge.
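Since the evaluation code is not yet released, the following is only a minimal sketch of how judge verdicts could be turned into a ranking. The verdict format, the model names, and the `rank_by_win_rate` helper are all hypothetical, and a plain win rate stands in for whatever rating method the actual pipeline uses.

```python
from collections import defaultdict

# Hypothetical judge verdicts: (model_a, model_b, winner), where winner is
# "a", "b", or "tie". These records are made-up illustration data, not
# taken from the dataset.
verdicts = [
    ("model-x", "model-y", "a"),
    ("model-x", "model-z", "a"),
    ("model-y", "model-z", "b"),
    ("model-y", "model-x", "tie"),
]


def rank_by_win_rate(verdicts):
    """Return [(model, win_rate)] sorted best first; a tie counts as half a win."""
    score = defaultdict(float)
    games = defaultdict(int)
    for model_a, model_b, winner in verdicts:
        if winner not in ("a", "b", "tie"):
            raise ValueError(f"unexpected verdict: {winner!r}")
        games[model_a] += 1
        games[model_b] += 1
        if winner == "a":
            score[model_a] += 1.0
        elif winner == "b":
            score[model_b] += 1.0
        else:
            score[model_a] += 0.5
            score[model_b] += 0.5
    rates = {model: score[model] / games[model] for model in games}
    return sorted(rates.items(), key=lambda kv: kv[1], reverse=True)


ranking = rank_by_win_rate(verdicts)
for model, rate in ranking:
    print(f"{model}: {rate:.2f}")
```

In practice each verdict would come from prompting a VLM judge with a benchmark prompt, its image, and two model responses; that step is omitted here because the judge prompt and output format are not specified in this description.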
Dataset Sources
Repository: https://github.com/lm-sys/FastChat
Paper: https://arxiv.org/abs/2412.08687
Automatic Evaluation Code: Coming Soon!
Dataset Structure
question_id: The unique hash representing the…
See the full description on the dataset page: https://huggingface.co/datasets/lmarena-ai/vision-arena-bench-v0.1.
Tags
task_categories:visual-question-answering
license:mit
size_categories:n<1K
format:parquet
modality:image
modality:text
library:datasets
library:pandas
library:mlcroissant
library:polars
arxiv:2412.08687
region:us