princeton-nlp/SWE-bench_Verified

General NLP · English · Benchmark

The princeton-nlp/SWE-bench_Verified dataset is an English-language General NLP resource from princeton-nlp (2026) comprising 500 examples.

This dataset is used as an LLM benchmark.

About princeton-nlp/SWE-bench_Verified

Dataset Summary

SWE-bench Verified is a subset of 500 samples from the SWE-bench test set, which have been human-validated for quality. SWE-bench is a dataset that tests systems’ ability to solve GitHub issues automatically. See this post for more...

Details

Task: General NLP
Language: English
Format: Parquet
Rows / instances: 500
Creator: princeton-nlp
Year: 2026
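
The details above (a Parquet-backed dataset of 500 rows under the princeton-nlp/SWE-bench_Verified identifier) can be checked with a short script. The following is a minimal sketch, not an official loader: it assumes the Hugging Face `datasets` package is installed, that the data is exposed as a single "test" split, and that each row carries a `repo` column naming the source GitHub repository. The helper names `load_verified` and `count_by_repo` are hypothetical and exist only for this illustration.

```python
from collections import Counter
from typing import Iterable, Mapping


def load_verified():
    """Load the dataset from the Hugging Face Hub (requires network access)."""
    # Imported lazily so count_by_repo stays usable without `datasets` installed.
    from datasets import load_dataset

    # Assumption: the 500 instances live in a single split named "test".
    return load_dataset("princeton-nlp/SWE-bench_Verified", split="test")


def count_by_repo(rows: Iterable[Mapping[str, object]]) -> Counter:
    """Count instances per source repository, assuming a `repo` column."""
    return Counter(str(row["repo"]) for row in rows)


if __name__ == "__main__":
    ds = load_verified()
    # The page lists 500 rows; print the actual count to compare.
    print(len(ds))
    # Show the five repositories that contribute the most instances.
    for repo, n in count_by_repo(ds).most_common(5):
        print(repo, n)
```

Because `count_by_repo` accepts any iterable of mappings, it can also be tried on a few hand-written rows before downloading anything.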