
Portuguese Newswire Corpus

Text Corpora, Portuguese (Brazil)

Created by Boğaziçi University in 2016, the Portuguese Newswire Corpus is a text corpus in Portuguese (Brazil), distributed in HTML format.

About Portuguese Newswire Corpus

The dataset contains newswire articles collected between 1994 and 2016 (the article count is not specified). The articles are provided as raw HTML pages, hosted on GitHub at the download link, and require preprocessing before use.
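Since the articles come as raw HTML, a first preprocessing pass usually means stripping markup down to plain text. The following is a minimal sketch using only Python's standard library; the names `TextExtractor` and `html_to_text` are illustrative, not part of the corpus tooling, and the sketch assumes nothing about how the pages are actually laid out. Real pages will likely need extra rules for navigation menus and other page furniture.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text and skips <script> and <style> contents.

    Hypothetical helper for illustration; not shipped with the corpus.
    """

    SKIP_TAGS = {"script", "style"}

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into characters.
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text found outside skipped elements.
        if self._skip_depth == 0:
            text = data.strip()
            if text:
                self._chunks.append(text)

    def text(self):
        return "\n".join(self._chunks)


def html_to_text(html: str) -> str:
    """Return the visible text of an HTML string, one text node per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

To apply it to a downloaded page, read the file with the encoding the page declares (an assumption to check per file, since older newswire pages are not always UTF-8) and pass the resulting string to `html_to_text`.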

Details

Task: Text Corpora
Language: Portuguese (Brazil)
Format: HTML
Rows / instances: n/a
Creator: Boğaziçi University
Year: 2016
Download
