neuralwork/arxiver
General NLP, English
neuralwork/arxiver is an English-language dataset for general NLP tasks, distributed in Parquet format.
About neuralwork/arxiver
Arxiver Dataset
Arxiver consists of 63,357 arXiv papers converted to multi-markdown (.mmd) format. Our dataset includes original arXiv article IDs, titles, abstracts, authors, publication dates, URLs and corresponding markdown files published b...
Details
- Task: General NLP
- Language: English
- Format: Parquet
- Rows / instances: N/A
- Creator: neuralwork
- Year: 2024