Question 1

What is The New York Times Annotated Corpus dataset?

Accepted Answer

The dataset contains over 1.8 million articles written and published by the New York Times between January 1, 1987 and June 19, 2007, with article metadata provided by the New York Times Newsroom.

Question 2

Is The New York Times Annotated Corpus a benchmark?

Accepted Answer

The New York Times Annotated Corpus is a dataset for training or evaluation; it isn't tracked as a standard LLM benchmark in our catalog.

Question 3

Where can I download The New York Times Annotated Corpus?

Accepted Answer

The New York Times Annotated Corpus is available at its source: https://catalog.ldc.upenn.edu/LDC2008T19.
