CodeUnknown

SWE-bench_Verified

by princeton-nlp

Silver63
694.3Kdownloads
316likes

Description

Dataset Summary SWE-bench Verified is a subset of 500 samples from the SWE-bench test set, which have been human-validated for quality. SWE-bench is a dataset that tests systems’ ability to solve GitHub issues automatically. See this post for more details on the human-validation process. The dataset collects 500 test Issue-Pull Request pairs from popular Python repositories. Evaluation is performed by unit test verification using post-PR behavior as the reference solution. The original… See the full description on the dataset page: https://huggingface.co/datasets/princeton-nlp/SWE-bench_Verified.

What can I do with this?

Tags

size_categories:n<1Kformat:parquetmodality:textlibrary:datasetslibrary:pandaslibrary:mlcroissantlibrary:polarsregion:us