papluca/language-identification

Text Classification | AR, BG, DE

papluca/language-identification is a text classification dataset covering AR, BG, and DE, distributed in Parquet format.

About papluca/language-identification

Dataset Card for Language Identification dataset

Dataset Summary

The Language Identification dataset is a collection of 90k samples consisting of text passages and their corresponding language labels. This dataset was created by collecting...
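Since each sample pairs a text passage with a language label, a quick sanity check on any copy of the data is to count samples per label. The following is a minimal sketch only: the rows are hypothetical in-memory examples, not records from the dataset, and the field names `text` and `labels` are assumptions that this page does not confirm.

```python
from collections import Counter
from typing import Iterable, Mapping

# Hypothetical rows standing in for real samples.
# The keys "text" and "labels" are assumed, not taken from this page.
SAMPLE_ROWS = [
    {"text": "Guten Morgen, wie geht es dir?", "labels": "de"},
    {"text": "Добро утро, как си?", "labels": "bg"},
    {"text": "صباح الخير", "labels": "ar"},
    {"text": "Das Wetter ist heute schön.", "labels": "de"},
]


def label_counts(rows: Iterable[Mapping[str, str]], label_key: str = "labels") -> Counter:
    """Count how many rows carry each language label."""
    return Counter(row[label_key] for row in rows)


counts = label_counts(SAMPLE_ROWS)
for language, n in counts.most_common():
    print(f"{language}: {n}")
```

The same function would work on rows read from the Parquet files, provided they are exposed as dictionaries and `label_key` is set to whatever the label column is actually called.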

Details

Task: Text Classification
Language: AR, BG, DE
Format: Parquet
Rows / instances: N/A
Creator: papluca
Year: 2022