
reward-bench-results

by allenai

22.9K downloads
3 likes

Description

Results for Holistic Evaluation of Reward Models (HERM) Benchmark

Here, you'll find the raw scores for the HERM project. The repository is structured as follows:

├── best-of-n/                  <- Nested directory for different completions on Best of N challenge
|   ├── alpaca_eval/                └── results for each reward model
|   |   ├── tulu-13b/{org}/{model}.json
|   |   └── zephyr-7b/{org}/{model}.json
|   └── mt_bench/
|…

See the full description on the dataset page: https://huggingface.co/datasets/allenai/reward-bench-results.
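As a rough illustration of the layout described above, the following Python sketch builds the repository-relative path of one Best of N result file and, optionally, downloads it. It covers only the `alpaca_eval` branch that the description shows. The helper names `best_of_n_path` and `load_result` are hypothetical, as are the `example-org` and `example-model` placeholders. The schema of the JSON files is not described here, so the loader simply returns whatever the file contains.

```python
import json
from pathlib import PurePosixPath
from typing import Any

# Repository id taken from the dataset URL in the description.
REPO_ID = "allenai/reward-bench-results"

# Generator directories listed under best-of-n/alpaca_eval/ in the description.
KNOWN_GENERATORS = ("tulu-13b", "zephyr-7b")


def best_of_n_path(generator: str, org: str, model: str) -> str:
    """Build best-of-n/alpaca_eval/{generator}/{org}/{model}.json (hypothetical helper)."""
    if generator not in KNOWN_GENERATORS:
        raise ValueError(f"unknown generator directory: {generator!r}")
    return str(PurePosixPath("best-of-n", "alpaca_eval", generator, org, f"{model}.json"))


def load_result(path: str) -> Any:
    """Download one result file and parse it as JSON.

    Needs network access and the huggingface_hub package, so the import
    is kept local to this function.
    """
    from huggingface_hub import hf_hub_download

    local_file = hf_hub_download(repo_id=REPO_ID, filename=path, repo_type="dataset")
    with open(local_file, encoding="utf-8") as handle:
        return json.load(handle)


if __name__ == "__main__":
    # Placeholder org/model names; substitute a reward model that exists in the repo.
    print(best_of_n_path("tulu-13b", "example-org", "example-model"))
```

Passing the resulting path to `load_result` should fetch the file, assuming a reward model with that org and name is actually present in the repository.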


Tags

region:us