reward-bench-results

by allenai

22.9K downloads

3 likes

Description

Results for the Holistic Evaluation of Reward Models (HERM) Benchmark

Here you'll find the raw scores for the HERM project. The repository is structured as follows:

├── best-of-n/                          <- Nested directory for different completions on the Best of N challenge
|   ├── alpaca_eval/                        └── results for each reward model
|   |   ├── tulu-13b/{org}/{model}.json
|   |   └── zephyr-7b/{org}/{model}.json
|   └── mt_bench/
|…

See the full description on the dataset page: https://huggingface.co/datasets/allenai/reward-bench-results.
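As a rough illustration of the layout described above, the following Python sketch builds the repository path for one result file under best-of-n/alpaca_eval/ and, optionally, downloads it. The helper names (`best_of_n_path`, `load_scores`) and the `example-org` / `example-model` values are hypothetical placeholders, not names from the dataset. The download step assumes the `huggingface_hub` package is installed, and nothing is assumed here about the fields inside each JSON file. The description is truncated after mt_bench/, so this sketch makes no claim about the layout below that directory.

```python
import json
from pathlib import PurePosixPath

# Dataset repository named on the dataset page.
REPO_ID = "allenai/reward-bench-results"


def best_of_n_path(benchmark: str, generator: str, org: str, model: str) -> str:
    """Build the in-repo path best-of-n/{benchmark}/{generator}/{org}/{model}.json.

    Hypothetical helper: it only mirrors the directory pattern shown in the
    description and does not check that the file exists.
    """
    return str(PurePosixPath("best-of-n", benchmark, generator, org, f"{model}.json"))


def load_scores(benchmark: str, generator: str, org: str, model: str):
    """Download one result file and parse it as JSON (requires network access).

    Hypothetical helper; the structure of the returned object is not
    described in the source, so it is returned as-is.
    """
    # Imported lazily so the path helper works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    local_file = hf_hub_download(
        repo_id=REPO_ID,
        filename=best_of_n_path(benchmark, generator, org, model),
        repo_type="dataset",
    )
    with open(local_file, encoding="utf-8") as f:
        return json.load(f)


# Placeholder org/model names, for illustration only.
print(best_of_n_path("alpaca_eval", "tulu-13b", "example-org", "example-model"))
# -> best-of-n/alpaca_eval/tulu-13b/example-org/example-model.json
```

Calling `load_scores("alpaca_eval", "zephyr-7b", org, model)` with a real organisation and model name from the repository would fetch the corresponding file. Browse the dataset page first to see which names exist.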
