SWE-bench/SWE-bench_Verified
General NLP, English, Benchmark
SWE-bench/SWE-bench_Verified is an English-language benchmark dataset published by SWE-bench, categorized as General NLP and distributed in Parquet format.
📊 This dataset is used as an LLM benchmark.
About SWE-bench/SWE-bench_Verified
Dataset Summary
SWE-bench Verified is a subset of 500 samples from the SWE-bench test set that have been human-validated for quality. SWE-bench is a dataset that tests systems’ ability to solve GitHub issues automatically. See this post for more...
Details
- Task: General NLP
- Language: English
- Format: Parquet
- Rows / instances: 500 (per the dataset summary; listed as N/A in the catalog metadata)
- Creator: SWE-bench
- Year: 2025
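
As a rough illustration of how a Parquet-backed dataset like this is commonly loaded, the sketch below uses the Hugging Face `datasets` library. It assumes the dataset is hosted on the Hugging Face Hub under the ID shown in the title and that its split is named `test`; neither is stated on this page. The helper names `load_verified` and `has_expected_size` are hypothetical and exist only for this example.

```python
DATASET_ID = "SWE-bench/SWE-bench_Verified"  # ID as given in the page title
EXPECTED_ROWS = 500  # sample count stated in the dataset summary


def load_verified(split: str = "test"):
    """Load the dataset from the Hugging Face Hub (requires network access).

    Assumption: the split is named "test"; adjust if the Hub listing differs.
    """
    # Imported lazily so this module can be imported without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(DATASET_ID, split=split)


def has_expected_size(n_rows: int) -> bool:
    """Return True if the row count matches the 500 samples in the summary."""
    return n_rows == EXPECTED_ROWS
```

Calling `load_verified()` and passing the length of the result to `has_expected_size` gives a quick check that the download matches the size stated in the summary.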