paperbay.org Mastodon - https://paperbay.org/users/a/statuses/111278044007116613

The Common Crawl September/October 2023 crawl archive (CC-MAIN-2023-40) has been released.

100 TiB (compressed) of freshly crawled web data that can be used in your next data mining project.

🔗 https://data.commoncrawl.org/crawl-data/CC-MAIN-2023-40/index.html
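
To get started with the archive, here is a minimal Python sketch for turning the crawl's path listing into download URLs. It assumes the usual Common Crawl layout, where each crawl directory holds a gzipped `warc.paths.gz` file with one relative path per line; the helper names `paths_listing_url` and `parse_paths` are made up for illustration.

```python
import gzip

CRAWL_ID = "CC-MAIN-2023-40"  # crawl named in this post
BASE_URL = "https://data.commoncrawl.org/"


def paths_listing_url(crawl_id: str, kind: str = "warc") -> str:
    """Build the URL of the gzipped path listing.

    Assumption: listings live at crawl-data/<crawl_id>/<kind>.paths.gz.
    """
    return f"{BASE_URL}crawl-data/{crawl_id}/{kind}.paths.gz"


def parse_paths(gz_bytes: bytes) -> list[str]:
    """Decompress a path listing and return one full URL per non-empty line."""
    text = gzip.decompress(gz_bytes).decode("utf-8")
    return [BASE_URL + line.strip() for line in text.splitlines() if line.strip()]
```

One possible way to use it: fetch the listing with `urllib.request.urlopen(paths_listing_url(CRAWL_ID)).read()` and pass the bytes to `parse_paths`. Each resulting URL points to one compressed WARC file, so you can download a handful rather than the whole 100 TiB.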

#commoncrawl #dataset #opendata #open #research