Preference & Alignment (DPO/RLHF) · Preference Learning · Commercial OK

vision-arena-bench-v0.1

by lmarena-ai

Bronze 36

1.7K downloads

3 likes

100<n<1K

Description

VisionArena-Bench: An automatic eval pipeline to estimate model preference rankings

An automatic benchmark of 500 diverse user prompts that can be used to cheaply approximate Chatbot Arena model rankings, with a VLM acting as the judge.

Dataset Sources

Repository: https://github.com/lm-sys/FastChat
Paper: https://arxiv.org/abs/2412.08687
Automatic Evaluation Code: Coming Soon!

Dataset Structure

question_id: The unique hash representing the…

See the full description on the dataset page: https://huggingface.co/datasets/lmarena-ai/vision-arena-bench-v0.1.
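To make the idea of estimating a ranking from judge outputs concrete, here is a minimal sketch of how pairwise verdicts could be turned into an ordering. The description does not say how VisionArena-Bench aggregates verdicts, so this sketch assumes a plain win rate (ties count as half a win). The function name `rank_models`, the verdict tuple layout, and the model names are hypothetical and are not part of the dataset or its evaluation code.

```python
from collections import defaultdict


def rank_models(verdicts):
    """Rank models by win rate from pairwise judge verdicts.

    ASSUMPTION: simple win-rate aggregation, not the benchmark's own method.
    `verdicts` is an iterable of (model_a, model_b, winner) tuples, where
    winner is "a", "b", or "tie" (a hypothetical format for illustration).
    """
    wins = defaultdict(float)
    games = defaultdict(int)
    for model_a, model_b, winner in verdicts:
        if winner not in ("a", "b", "tie"):
            raise ValueError(f"unknown verdict: {winner!r}")
        games[model_a] += 1
        games[model_b] += 1
        if winner == "a":
            wins[model_a] += 1.0
        elif winner == "b":
            wins[model_b] += 1.0
        else:
            # A tie is split evenly between the two models.
            wins[model_a] += 0.5
            wins[model_b] += 0.5
    rates = {model: wins[model] / games[model] for model in games}
    # Highest win rate first; model name breaks ties for a stable order.
    return sorted(rates.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical verdicts from a VLM judge on three prompts.
example_verdicts = [
    ("model-x", "model-y", "a"),
    ("model-x", "model-z", "tie"),
    ("model-y", "model-z", "b"),
]
ranking = rank_models(example_verdicts)
```

With these three verdicts, `model-x` and `model-z` each score 0.75 and `model-y` scores 0.0. A win rate is the simplest possible aggregate; leaderboards built on pairwise preferences often fit a statistical model instead, which this sketch does not attempt.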

Tags