
ACL Anthology Reference Corpus (ACL ARC)


ACL Anthology Reference Corpus (ACL ARC) is an English text corpus benchmark dataset from Lahiri et al. It contains 10,921 records in text format.


About ACL Anthology Reference Corpus (ACL ARC)

The dataset contains 10,921 articles from the February 2007 snapshot of the Anthology. Text and metadata were extracted for each article. The metadata consists of BibTeX records derived either from the headers of each paper or from metadata taken from the Anthology website.
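
Since the metadata is distributed as BibTeX records, a reader may want to turn a record into a simple key/value structure. The following is a minimal sketch in Python, not the corpus's own tooling. The function name `parse_bibtex_record` and the sample record are hypothetical, the record is made up for illustration and not taken from the corpus, and the parser assumes simple fields without nested braces.

```python
import re


def parse_bibtex_record(record: str) -> dict:
    """Parse one simple BibTeX entry into a flat dict.

    Assumption: every field has the form key = {value} or key = "value",
    and values contain no nested braces.
    """
    # Match the entry header, e.g. "@inproceedings{doe2006example,".
    header = re.match(r"\s*@(\w+)\s*\{\s*([^,\s]+)\s*,", record)
    if header is None:
        raise ValueError("not a BibTeX entry")

    entry = {"type": header.group(1).lower(), "id": header.group(2)}

    # Match each field after the header, with a braced or quoted value.
    field_pattern = re.compile(r'(\w+)\s*=\s*(?:\{([^{}]*)\}|"([^"]*)")')
    for match in field_pattern.finditer(record, header.end()):
        name = match.group(1).lower()
        value = match.group(2) if match.group(2) is not None else match.group(3)
        # Collapse internal whitespace, including line breaks inside a value.
        entry[name] = " ".join(value.split())
    return entry


# Made-up record for illustration only; not taken from the corpus.
EXAMPLE = """
@inproceedings{doe2006example,
  author = {Doe, Jane},
  title = {An Example Paper},
  year = {2006}
}
"""

record = parse_bibtex_record(EXAMPLE)
print(record["type"], record["id"], record["year"])
```

Records with nested braces, string concatenation, or macros would need a full BibTeX parser rather than this regular-expression approach.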

Details

Task: Text Corpora
Language: English
Format: Text
Rows / instances: 10,921
Creator: Lahiri et al.
Year: 2014
