Name                                    File Type         Size     Last Modified
reddit_posts.txt                        text/plain        82.8 MB  12/02/2021 06:38 AM
reddit_sample.csv.gz                    application/gzip  46.9 MB  12/06/2021 04:10 AM
stormfront_post_data_processed.json.gz  application/gzip  81.6 MB  12/02/2021 06:39 AM
stormfront_posts.txt                    text/plain        14.5 MB  11/18/2021 10:33 AM

Project Citation: 

Hemphill, Libby. What Social Media Platforms Miss About White Supremacist Speech. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2023-01-30. https://doi.org/10.3886/E156161V2

Project Description

Summary: The data include 274,668 posts scraped from Stormfront and 509,982 comments collected from the Reddit API. The following files are included:
  • stormfront_posts.txt: one post per line, no post metadata
  • reddit_posts.txt: one comment per line, no comment metadata
  • stormfront_post_data_processed.json.gz: preprocessed posts from Stormfront, includes post metadata
  • reddit_sample.csv.gz: preprocessed comments from Reddit, includes comment metadata
Twitter data used in the report is not available for public reuse because of Twitter's terms of service and our data use agreement with VOX-Pol.
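
The four files described above could be read along the following lines. This is a minimal sketch using only the Python standard library, not the depositor's own code. The function names are hypothetical. The internal layout of the JSON file is not documented here, so the loader assumes it is either a single JSON document or one JSON object per line. The CSV column names are likewise not assumed: rows come back keyed by whatever header the file carries.

```python
import csv
import gzip
import json


def iter_text_posts(path):
    """Yield one post or comment per line of a plain-text file.

    Assumption: empty lines carry no content and are skipped, so the
    number of yielded items may differ from the raw line count.
    """
    with open(path, encoding="utf-8", errors="replace") as handle:
        for line in handle:
            text = line.rstrip("\n")
            if text:
                yield text


def iter_csv_gz_rows(path):
    """Yield each row of a gzip-compressed CSV file as a dict.

    Keys are taken from the file's header row; no column names are assumed.
    """
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        yield from csv.DictReader(handle)


def load_json_gz(path):
    """Load a gzip-compressed JSON file into memory.

    Assumption: the file is either one JSON document or JSON Lines
    (one object per line). The whole file is decompressed into memory.
    """
    with gzip.open(path, mode="rt", encoding="utf-8") as handle:
        content = handle.read()
    try:
        return json.loads(content)
    except json.JSONDecodeError:
        # Fall back to JSON Lines: parse each non-empty line separately.
        return [json.loads(line) for line in content.splitlines() if line.strip()]
```

For example, `sum(1 for _ in iter_text_posts("stormfront_posts.txt"))` counts the non-empty lines in the Stormfront text file, and `next(iter_csv_gz_rows("reddit_sample.csv.gz")).keys()` shows which metadata columns the Reddit sample actually provides.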

The following Python modules were used for analysis:
Funding Sources: Anti-Defamation League (n/a)




This material is distributed exactly as it arrived from the data depositor. ICPSR has not checked or processed this material. Users should consult the investigator(s) if further information is desired.