reward-bench-results
by allenai
22.9K downloads
3 likes
Description
Results for the Holistic Evaluation of Reward Models (HERM) Benchmark
Here, you'll find the raw scores for the HERM project.
The repository is structured as follows:
├── best-of-n/                  <- Nested directory for different completions on Best of N challenge
|   ├── alpaca_eval/            <- results for each reward model
|   |   ├── tulu-13b/{org}/{model}.json
|   |   └── zephyr-7b/{org}/{model}.json
|   └── mt_bench/
|… See the full description on the dataset page: https://huggingface.co/datasets/allenai/reward-bench-results.
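The layout above implies that each reward model's Best of N scores live in one JSON file per generator directory. A minimal sketch of locating and reading such a file from a local copy of the repository follows. The helper names `result_path` and `load_result` are hypothetical, the `{org}` and `{model}` values are supplied by the caller, and nothing is assumed about the JSON schema. Only the `alpaca_eval` branch is covered, because the rest of the tree is truncated in this description.

```python
import json
from pathlib import Path

# Generator directories shown under best-of-n/alpaca_eval/ in the tree above.
ALPACA_EVAL_GENERATORS = ("tulu-13b", "zephyr-7b")


def result_path(root, generator, org, model):
    """Build <root>/best-of-n/alpaca_eval/{generator}/{org}/{model}.json.

    `root` is assumed to be a local copy of the dataset repository.
    """
    if generator not in ALPACA_EVAL_GENERATORS:
        raise ValueError(f"unknown generator directory: {generator!r}")
    return Path(root) / "best-of-n" / "alpaca_eval" / generator / org / f"{model}.json"


def load_result(root, generator, org, model):
    """Parse one result file and return its contents unchanged."""
    with result_path(root, generator, org, model).open(encoding="utf-8") as fh:
        return json.load(fh)
```

For example, `load_result("reward-bench-results", "tulu-13b", org, model)` would read the file for a given `org` and `model` pair, provided that pair exists in the local copy.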
Tags
region:us