Portuguese Newswire Corpus
Text Corpora | Portuguese (Brazil)
Created by Boğaziçi University in 2016, the Portuguese Newswire Corpus is a text corpus dataset in Portuguese (Brazil), distributed in HTML format.
About Portuguese Newswire Corpus
The dataset contains newswire articles collected between 1994 and 2016; the number of articles is not specified. The articles are provided as HTML pages, hosted on GitHub at the download link, and require preprocessing before use.
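As a rough illustration of the HTML preprocessing step, the sketch below strips markup from a page and keeps only the visible text. It uses only the Python standard library. The layout of the corpus pages is not described here, so the names `TextExtractor` and `html_to_text` are hypothetical, and the choice to skip only `<script>` and `<style>` content is an assumption that may need adjusting for the actual files.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> contents.

    Hypothetical helper: the real corpus pages may need extra rules
    (for example, dropping navigation or footer markup).
    """

    SKIP_TAGS = {"script", "style"}

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into characters.
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep non-empty text runs that are outside skipped elements.
        if self._skip_depth == 0:
            text = data.strip()
            if text:
                self._chunks.append(text)

    def text(self):
        # Each text run becomes its own line in the output.
        return "\n".join(self._chunks)


def html_to_text(html: str) -> str:
    """Return the visible text of an HTML document as plain text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

A function like this could be applied to each downloaded page after reading it with the correct character encoding. Because every text run is placed on its own line, inline tags split a sentence across lines; joining runs differently may suit some downstream uses better.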
Details
- Task: Text Corpora
- Language: Portuguese (Brazil)
- Format: HTML
- Rows / instances: n/a
- Creator: Boğaziçi University
- Year: 2016