Preference & Alignment (DPO/RLHF) | Preference Learning | Commercial OK

vision-arena-bench-v0.1

by lmarena-ai

Bronze 36
1.7K downloads
3 likes
100<n<1K

Description

VisionArena-Bench: An automatic eval pipeline to estimate model preference rankings

An automatic benchmark of 500 diverse user prompts that can be used to cheaply approximate Chatbot Arena model rankings, using a VLM as a judge.

Dataset Sources

Repository: https://github.com/lm-sys/FastChat
Paper: https://arxiv.org/abs/2412.08687
Automatic Evaluation Code: Coming Soon!

Dataset Structure

question_id: The unique hash representing the…

See the full description on the dataset page: https://huggingface.co/datasets/lmarena-ai/vision-arena-bench-v0.1.
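Since the automatic evaluation code is not yet published, the following is only a minimal sketch of how one might fetch the data and sanity-check the one documented field, question_id. It assumes the Hugging Face `datasets` library (listed in the tags) is installed and that the repository id matches the dataset page URL. The helper names `load_bench` and `duplicate_question_ids` are hypothetical and are not part of any official tooling. No split name is assumed, because the card excerpt does not state one.

```python
from collections import Counter
from typing import Iterable, List, Mapping

# Repository id taken from the dataset page URL in the description.
REPO_ID = "lmarena-ai/vision-arena-bench-v0.1"


def duplicate_question_ids(rows: Iterable[Mapping]) -> List:
    """Return question_id values that appear more than once, in first-seen order.

    Hypothetical helper: the card describes question_id as a unique hash,
    so an empty result is what one would expect.
    """
    counts = Counter(row["question_id"] for row in rows)
    return [qid for qid, n in counts.items() if n > 1]


def load_bench():
    """Download all available splits from the Hugging Face Hub.

    The import is lazy so that the helper above can be used without
    `datasets` installed. Returns a DatasetDict keyed by split name.
    """
    from datasets import load_dataset  # pip install datasets

    return load_dataset(REPO_ID)


# Example usage (requires network access, so it is left commented out):
#
# bench = load_bench()
# for split_name, split in bench.items():
#     ids = split["question_id"]  # read only the id column, not the images
#     dups = duplicate_question_ids({"question_id": q} for q in ids)
#     print(split_name, len(split), "rows;", len(dups), "duplicated ids")
```

Reading only the question_id column avoids decoding every image just to count rows. Any other column names would need to be taken from the full dataset page rather than guessed here.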


Tags

task_categories:visual-question-answering
license:mit
size_categories:n<1K
format:parquet
modality:image
modality:text
library:datasets
library:pandas
library:mlcroissant
library:polars
arxiv:2412.08687
region:us