CShorten/ML-ArXiv-Papers

General NLP, English

The CShorten/ML-ArXiv-Papers dataset is an English General NLP resource published by CShorten in 2022. It has about 4.2K downloads and 67 likes. It is released under the afl-3.0 license and falls in the 100K<n<1M size category.

About CShorten/ML-ArXiv-Papers

This dataset contains the subset of ArXiv papers tagged "cs.LG", the tag that marks a paper as being about Machine Learning. The core dataset is filtered from the full ArXiv dataset hosted on Kaggle: https://www.kaggle.com/datasets/Cornell-University/ar...
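The filtering step described above can be sketched roughly as follows. This is a minimal illustration, not the creator's actual script: it assumes the source metadata is a JSON Lines file in which each record has a `categories` field holding a space-separated string of tags (for example `"cs.LG stat.ML"`). The function name `filter_by_category` and the sample records are hypothetical.

```python
import json
from typing import Iterable, Iterator


def filter_by_category(lines: Iterable[str], tag: str = "cs.LG") -> Iterator[dict]:
    """Yield metadata records whose category list contains `tag`."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # Assumption: "categories" is a space-separated string of tags.
        # Splitting first avoids matching a tag that merely contains `tag`.
        if tag in record.get("categories", "").split():
            yield record


# Hypothetical sample records standing in for the real metadata file.
sample = [
    '{"id": "0001", "title": "A", "categories": "cs.LG stat.ML"}',
    '{"id": "0002", "title": "B", "categories": "math.CO"}',
    '{"id": "0003", "title": "C", "categories": "cs.CV cs.LG"}',
]
kept = [r["id"] for r in filter_by_category(sample)]
print(kept)  # ['0001', '0003']
```

To run this over a real file, pass an open file handle instead of `sample`; because the function reads one line at a time, the full metadata dump never has to fit in memory.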

Details

Task: General NLP
Language: English
Format: Parquet
Rows / instances: N/A
Size: 100K<n<1M
Creator: CShorten
Year: 2022
License: afl-3.0
Downloads: 4198
Likes: 67
