UIT-SPC

Text Corpora | Vietnamese

The UIT-SPC dataset is a Vietnamese text corpus from Thin et al. (2017) comprising 1,565 examples.

About UIT-SPC

The dataset contains 1,565 papers from top NLP/CL conferences such as ACL, CoNLL, EACL, NAACL, and EMNLP. The papers were pre-processed by removing unnecessary content (e.g., formulas and tables), then converted to XML that preserves each paper's title, sections, and subsections according to the paper's structure. [The corpus is available on request from the authors.]
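Since the corpus stores each paper as XML with a title, sections, and subsections, a reader might walk that hierarchy roughly as sketched below. The actual UIT-SPC schema is not documented on this page, so the tag names (`paper`, `title`, `section`, `subsection`, `text`), the `name` attribute, and the sample document are all assumptions made for illustration; adjust them to the real files once obtained from the authors.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the real UIT-SPC schema is not documented here,
# so every tag and attribute name below is an assumption.
SAMPLE = """<paper>
  <title>An Example Paper</title>
  <section name="Introduction">
    <text>Intro text.</text>
    <subsection name="Motivation">
      <text>Why it matters.</text>
    </subsection>
  </section>
  <section name="Conclusion">
    <text>Closing text.</text>
  </section>
</paper>"""


def outline(xml_string):
    """Return (title, [(section_name, [subsection_names]), ...])."""
    root = ET.fromstring(xml_string)
    title = root.findtext("title", default="").strip()
    sections = []
    # Only direct children are inspected, mirroring a two-level structure.
    for sec in root.findall("section"):
        subs = [sub.get("name", "") for sub in sec.findall("subsection")]
        sections.append((sec.get("name", ""), subs))
    return title, sections


title, sections = outline(SAMPLE)
print(title)
for name, subs in sections:
    print("-", name, subs)
```

The sketch uses only the standard library, so it can be adapted without extra dependencies once the real element names are known.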

Details

Task: Text Corpora
Language: Vietnamese
Format: n/a
Rows / instances: 1,565
Creator: Thin et al.
Year: 2017