DuoRC

Paraphrasing Identification, English

Created by Saha et al. in 2018, DuoRC is an English paraphrasing identification dataset containing 186,089 records in JSON format.

About DuoRC

The dataset contains 186,089 unique question-answer pairs created from a collection of 7,680 pairs of movie plots, where each pair reflects two versions of the same movie.
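
As a rough illustration of how question-answer pairs might be counted in a JSON file of this kind, here is a minimal Python sketch. The field names (`title`, `plot`, `qa`, `question`, `answers`) and the sample records are assumptions made for this example and are not taken from this page; check them against the actual files before relying on them.

```python
import json


def count_qa_pairs(movies):
    """Count question-answer pairs across a list of movie records.

    Assumes each record is a dict with an optional "qa" list
    (a hypothetical layout, not confirmed by the dataset page).
    """
    return sum(len(movie.get("qa", [])) for movie in movies)


# Tiny in-memory sample standing in for a real file (illustrative data only).
sample_json = """
[
  {"title": "Example Movie A", "plot": "...", "qa": [
      {"question": "Who is the hero?", "answers": ["Alice"]},
      {"question": "Where is it set?", "answers": ["Paris"]}
  ]},
  {"title": "Example Movie B", "plot": "...", "qa": [
      {"question": "Who is the villain?", "answers": ["Bob"]}
  ]}
]
"""

movies = json.loads(sample_json)
total = count_qa_pairs(movies)
print(total)
```

If the real files follow a similar layout, summing this count over all of them would be one way to check the total against the 186,089 figure quoted above.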

Details

Task: Paraphrasing Identification
Language: English
Format: JSON
Rows / instances: 186,089
Creator: Saha et al.
Year: 2018