Question 1

What is the AlgorithmicResearchGroup/s2orc_full dataset?

Accepted Answer

S2ORC Full — Semantic Scholar Open Research Corpus

A complete redistribution of the S2ORC dataset in Parquet format on Hugging Face, containing 14.5 million academic papers with full text, structured metadata, and citation information.

...

Question 2

Is AlgorithmicResearchGroup/s2orc_full a benchmark?

Accepted Answer

Yes — AlgorithmicResearchGroup/s2orc_full is used as an LLM benchmark. See model leaderboards in the Benchmarks section.

Question 3

Where can I download AlgorithmicResearchGroup/s2orc_full?

Accepted Answer

AlgorithmicResearchGroup/s2orc_full is available at its source: https://huggingface.co/datasets/AlgorithmicResearchGroup/s2orc_full.
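
Since the corpus is large, it may be more practical to stream a few records than to download everything first. The following is a minimal sketch, assuming the Hugging Face `datasets` library is installed and that the repository exposes a `train` split; the split name and the helper `peek_records` are illustrative assumptions, not something stated on the dataset page.

```python
from itertools import islice

# Repository ID taken from the dataset URL above.
REPO_ID = "AlgorithmicResearchGroup/s2orc_full"


def peek_records(n=3, split="train"):
    """Return the first n records by streaming, without a full download.

    The default split name "train" is an assumption; check the dataset
    page for the splits that actually exist.
    """
    # Imported lazily so this module still loads where `datasets` is absent.
    from datasets import load_dataset

    stream = load_dataset(REPO_ID, split=split, streaming=True)
    return list(islice(stream, n))
```

Calling `peek_records(2)` should return a list of two dictionaries, one per paper; inspecting their keys is a quick way to see which metadata and full-text fields are present.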

Question 4

What license is AlgorithmicResearchGroup/s2orc_full released under?

Accepted Answer

AlgorithmicResearchGroup/s2orc_full is distributed under the odc-by (Open Data Commons Attribution) license.
