Rowan/hellaswag
General NLP · EN · Benchmark
Rowan/hellaswag is an English-language general NLP benchmark dataset of 59,950 labeled examples, distributed in Parquet format.
This dataset is used as an LLM benchmark.
About Rowan/hellaswag
Dataset Card for "hellaswag"
Dataset Summary
HellaSwag is a dataset for commonsense natural language inference (NLI). It was introduced in the paper "HellaSwag: Can a Machine Really Finish Your Sentence?", published at ACL 2019.
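As a rough illustration of how a sentence-completion benchmark like this is typically scored, the sketch below picks the highest-scoring candidate ending for each context and reports accuracy. The field names `ctx`, `endings`, and `label` (a string index) are assumptions about the dataset schema and should be checked against the actual data. The toy records and the word-overlap scorer are hypothetical stand-ins for real rows and a real model. The loader uses the Hugging Face `datasets` library, which needs a separate install and network access.

```python
import re
from typing import Any, Callable, Iterable, Mapping

ScoreFn = Callable[[str, str], float]


def predict(record: Mapping[str, Any], score_fn: ScoreFn) -> int:
    """Return the index of the candidate ending that score_fn rates highest."""
    scores = [score_fn(record["ctx"], ending) for ending in record["endings"]]
    return max(range(len(scores)), key=scores.__getitem__)


def accuracy(records: Iterable[Mapping[str, Any]], score_fn: ScoreFn) -> float:
    """Fraction of records whose predicted ending matches the gold label."""
    total = 0
    correct = 0
    for record in records:
        total += 1
        # Assumption: labels are stored as strings such as "0".."3".
        correct += predict(record, score_fn) == int(record["label"])
    return correct / total if total else 0.0


def word_overlap_score(ctx: str, ending: str) -> float:
    """Hypothetical baseline scorer: count of distinct words shared with the context."""
    def tokens(text: str) -> set:
        return set(re.findall(r"[a-z]+", text.lower()))
    return float(len(tokens(ctx) & tokens(ending)))


def load_validation_split():
    """Load the validation split (requires `pip install datasets` and network access)."""
    from datasets import load_dataset
    return load_dataset("Rowan/hellaswag", split="validation")


# Hypothetical toy records in the assumed schema; not taken from the dataset.
TOY_RECORDS = [
    {
        "ctx": "A man pours water into a kettle.",
        "endings": [
            "He paints the fence blue.",
            "He puts the kettle on the stove.",
            "He jumps in the pool.",
            "He reads the newspaper aloud.",
        ],
        "label": "1",
    },
    {
        "ctx": "A woman ties her running shoes.",
        "endings": [
            "She goes running in the park.",
            "She bakes bread.",
            "She throws her running shoes into the lake.",
            "She falls asleep.",
        ],
        "label": "0",
    },
]

if __name__ == "__main__":
    print(accuracy(TOY_RECORDS, word_overlap_score))
```

On the two toy records the word-overlap baseline gets one right and one wrong, which hints at why surface cues alone are a weak strategy for this kind of task. To evaluate a real model, replace `word_overlap_score` with a function that returns, for example, the model's log-likelihood of the ending given the context, and pass the result of `load_validation_split()` as `records`.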
Supported Tasks and Leaderboards
More Infor...
Details
- Task: General NLP
- Language: EN
- Format: Parquet
- Rows / instances: 59,950
- Creator: Rowan
- Year: 2026