Code | Unknown

SWE-bench_Pro

by ScaleAI

Silver 60
848.0K downloads
70 likes

Description

Dataset Summary

SWE-Bench Pro is a challenging, enterprise-level dataset for testing agent ability on long-horizon software engineering tasks.

Paper: https://static.scale.com/uploads/654197dc94d34f66c0f5184e/SWEAP_Eval_Scale%20(9).pdf

Related evaluation code on GitHub: https://github.com/scaleapi/SWE-bench_Pro-os

Dataset Structure

We follow SWE-Bench Verified (https://huggingface.co/datasets/SWE-bench/SWE-bench_Verified) in terms of dataset structure, with several…

See the full description on the dataset page: https://huggingface.co/datasets/ScaleAI/SWE-bench_Pro.
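Since the listing is tagged as Parquet data usable with the `datasets` library, a first look at the records might be sketched as follows. This is a minimal sketch and not an official loader: the repository id is taken from the dataset page URL, the split name `"test"` is an assumption carried over from SWE-Bench Verified, and `load_swe_bench_pro` and `preview` are hypothetical helper names introduced only for illustration.

```python
DATASET_ID = "ScaleAI/SWE-bench_Pro"  # repository id from the dataset page URL


def load_swe_bench_pro(split="test"):
    """Download one split of the dataset from the Hugging Face Hub.

    ASSUMPTION: the split is named "test", as in SWE-Bench Verified.
    Check the dataset page if this raises an error.
    """
    # Imported lazily so the rest of this sketch works without the
    # third-party package installed (pip install datasets).
    from datasets import load_dataset

    return load_dataset(DATASET_ID, split=split)


def preview(records, limit=3):
    """Return (sorted column names, first `limit` rows) for dict-like rows."""
    head = []
    for index, record in enumerate(records):
        if index >= limit:
            break
        head.append(dict(record))
    columns = sorted({key for row in head for key in row})
    return columns, head


if __name__ == "__main__":
    # Requires network access; prints the column names that are actually
    # present rather than assuming any particular schema.
    columns, rows = preview(load_swe_bench_pro(), limit=1)
    print(columns)
```

Inspecting the column names first avoids hard-coding field names that the truncated description above does not spell out.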

Tags

benchmark:official
benchmark:eval-yaml
size_categories:n<1K
format:parquet
modality:text
library:datasets
library:pandas
library:polars
library:mlcroissant
region:us