Description
Dataset Summary
SWE-bench Verified is a subset of 500 samples from the SWE-bench test set, which have been human-validated for quality. SWE-bench is a dataset that tests systems’ ability to solve GitHub issues automatically. See this post for more details on the human-validation process.
The dataset collects 500 test Issue-Pull Request pairs from popular Python repositories. Evaluation is performed by unit test verification using post-PR behavior as the reference solution.
The original… See the full description on the dataset page: https://huggingface.co/datasets/princeton-nlp/SWE-bench_Verified.
What can I do with this?
Tags
size_categories:n<1Kformat:parquetmodality:textlibrary:datasetslibrary:pandaslibrary:mlcroissantlibrary:polarsregion:us