SFU Opinion and Comments Corpus (SOCC)
Created by Kolhatkar et al. in 2018, the SFU Opinion and Comments Corpus (SOCC) is an English text corpus containing 663,173 comment records in CSV format.
About SFU Opinion and Comments Corpus (SOCC)
The dataset contains 10,339 opinion articles (editorials, columns, and op-eds) together with their 663,173 comments from 303,665 comment threads, drawn from The Globe and Mail, the main English-language Canadian daily, and covering January 2012 to December 2016. In addition, an annotated subset contains 1,043 comments, annotated for toxicity, negation and its scope, and appraisal, that were posted in response to 10 different articles covering a variety of subjects: technology, immigration, terrorism, politics, budget, social issues, religion, property, and refugees.
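Since the corpus is organized as articles, comment threads, and comments, a first sanity check after download might be to count each level. The sketch below is only illustrative: the column names `article_id`, `thread_id`, and `comment_text` are hypothetical placeholders and are not taken from the SOCC files, so the header of the actual CSV should be checked first. It runs on a small in-memory sample rather than the real data.

```python
import csv
import io
from collections import Counter

# Hypothetical column names; check the header of the real CSV before use.
ARTICLE_COL = "article_id"
THREAD_COL = "thread_id"


def summarize_comments(csv_file):
    """Count comments, distinct threads, and comments per article."""
    reader = csv.DictReader(csv_file)
    per_article = Counter()
    threads = set()
    total = 0
    for row in reader:
        total += 1
        per_article[row[ARTICLE_COL]] += 1
        # A thread is identified here by (article, thread) so that thread
        # identifiers reused across articles are not merged.
        threads.add((row[ARTICLE_COL], row[THREAD_COL]))
    return {
        "comments": total,
        "threads": len(threads),
        "articles": len(per_article),
        "per_article": dict(per_article),
    }


# Tiny made-up sample standing in for the real comments file.
SAMPLE = """article_id,thread_id,comment_text
a1,t1,"First comment"
a1,t1,"A reply, in the same thread"
a1,t2,"New thread"
a2,t1,"Comment on another article"
"""

summary = summarize_comments(io.StringIO(SAMPLE))
print(summary)
```

For the real file, the same function could be given a file handle opened with `open(path, newline="", encoding="utf-8")`, where the path and the encoding are again assumptions. If the column names are adjusted to match the actual header, the totals would be expected to line up with the figures quoted above (663,173 comments in 303,665 threads).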
Details
- Task: Text Corpora, Text Classification
- Language: English
- Format: CSV
- Rows / instances: 663,173
- Creator: Kolhatkar et al.
- Year: 2018