Vietnamese Datasets
We catalog 7 Vietnamese-language datasets for NLP and machine learning. Browse the list below or narrow down by task.
Updated June 2026
- CC100-Vietnamese | Text Corpora | Vietnamese
- Vietnamese Question Answering Dataset (ViQuAD) | Question Answering | Vietnamese
- Vietnamese Multiple-choice Machine Reading Comprehension Corpus (ViMMRC) | Question Answering, Reading Comprehension | Vietnamese
- Vietnamese Students’ Feedback Corpus (UIT-VSFC) | Text Classification, Sentiment Analysis | Vietnamese
- UIT-SPC | Text Corpora | Vietnamese
- Vietnamese Social Media Emotion Corpus (UIT-VSMEC) | Emotion Classification | Vietnamese
- Vietnamese Image Captioning Dataset (UIT-ViIC) | Automatic Image Captioning | Vietnamese