
alpindale/two-million-bluesky-posts

General NLP | EN | apache-2.0

alpindale/two-million-bluesky-posts is an English-language (EN) General NLP dataset distributed in Parquet format under the apache-2.0 license. It falls in the 1M<n<10M size category and has been downloaded 622 times.

About alpindale/two-million-bluesky-posts

2 Million Bluesky Posts

This dataset contains 2 million public posts collected from Bluesky Social's firehose API, intended for machine learning research and experimentation with social media data. The with-language-predictions config contains ...

Details

Task: General NLP
Language: EN
Format: Parquet
Rows / instances: N/A
Size: 1M<n<10M
Creator: alpindale
Year: 2024
License: apache-2.0
Downloads: 622
Likes: 203

