bigcode/the-stack-v2
Text Generation, CODE
bigcode/the-stack-v2 is a source-code dataset for text generation, published by bigcode in 2024. It has about 29.1K downloads and 590 likes. Its license is listed as "other", and it falls in the 1B<n<10B size category.
About bigcode/the-stack-v2
The Stack v2
The dataset is available in four versions:
bigcode/the-stack-v2: the full "The Stack v2" dataset <-- you are here
bigcode/the-stack-v2-dedup: based on the bigcode/the-stack-v2 but further near-deduplicated
bigcode/the-stack-v2-tr...
Details
- Task: Text Generation
- Language: CODE
- Format: Parquet
- Rows / instances: N/A
- Size: 1B<n<10B
- Creator: bigcode
- Year: 2024
- License: other
- Downloads: 29,055
- Likes: 590
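
The size label `1B<n<10B` listed above is a range rather than an exact count. As an illustration only, the sketch below turns such a label into numeric bounds. It assumes that the suffixes K, M, B and T stand for 10^3, 10^6, 10^9 and 10^12, and it handles only the two-sided `low<n<high` form. The function name `parse_size_category` is hypothetical and is not part of any library.

```python
import re
from typing import Tuple

# Assumed multipliers for the suffixes used in size labels (an assumption,
# not taken from the listing itself).
_SUFFIX = {"": 1, "K": 10**3, "M": 10**6, "B": 10**9, "T": 10**12}


def parse_size_category(label: str) -> Tuple[int, int]:
    """Return the (lower, upper) exclusive bounds of a label like '1B<n<10B'."""
    match = re.fullmatch(r"(\d+)([KMBT]?)<n<(\d+)([KMBT]?)", label.strip())
    if match is None:
        # One-sided labels such as 'n<1K' are deliberately not handled here.
        raise ValueError(f"unsupported size label: {label!r}")
    lo_num, lo_suf, hi_num, hi_suf = match.groups()
    return int(lo_num) * _SUFFIX[lo_suf], int(hi_num) * _SUFFIX[hi_suf]


lower, upper = parse_size_category("1B<n<10B")
print(lower, upper)  # prints: 1000000000 10000000000
```

Under these assumptions, the label places the dataset between one billion and ten billion instances, which is consistent with the exact row count being listed as N/A.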