What Social Media Platforms Miss About White Supremacist Speech
Principal Investigator(s): Libby Hemphill, University of Michigan
Version: V2
Name | File Type | Size | Last Modified |
---|---|---|---|
reddit_posts.txt | text/plain | 82.8 MB | 12/02/2021 06:38 AM |
reddit_sample.csv.gz | application/gzip | 46.9 MB | 12/06/2021 04:10 AM |
stormfront_post_data_processed.json.gz | application/gzip | 81.6 MB | 12/02/2021 06:39 AM |
stormfront_posts.txt | text/plain | 14.5 MB | 11/18/2021 10:33 AM |
Project Citation:
Hemphill, Libby. What Social Media Platforms Miss About White Supremacist Speech. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2023-01-30. https://doi.org/10.3886/E156161V2
Project Description
Summary:
The data include 274,668 posts scraped from Stormfront and 509,982 comments collected from the Reddit API. The following files are included:
- stormfront_posts.txt: one post per line, no post metadata
- reddit_posts.txt: one comment per line, no comment metadata
- stormfront_post_data_processed.json.gz: preprocessed posts from Stormfront, includes post metadata
- reddit_sample.csv.gz: preprocessed comments from Reddit, includes comment metadata
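The files listed above can be read with the Python standard library alone. The following is a minimal sketch, not the depositor's own loading code. It assumes the files are UTF-8 encoded, that the CSV has a header row, and that the JSON file is either a single JSON document or JSON Lines; none of this is documented in the record, so inspect the files before relying on it. The helper names (`read_posts`, `read_csv_gz`, `read_json_gz`) and the download location are hypothetical.

```python
import csv
import gzip
import json
from pathlib import Path


def read_posts(path):
    """Yield one post or comment per line from a plain-text file.

    Blank lines are skipped, so the number of items yielded may be
    lower than the number of lines in the file.
    """
    with open(path, encoding="utf-8", errors="replace") as handle:
        for line in handle:
            text = line.rstrip("\n")
            if text:
                yield text


def read_csv_gz(path):
    """Yield each row of a gzipped CSV as a dict keyed by the header row.

    Assumption: the first row of the file holds the column names.
    """
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        yield from csv.DictReader(handle)


def read_json_gz(path):
    """Return the parsed contents of a gzipped JSON file.

    Assumption: the file is either one JSON document or JSON Lines
    (one JSON object per line). Both layouts are tried in that order.
    """
    with gzip.open(path, mode="rt", encoding="utf-8") as handle:
        content = handle.read()
    try:
        return json.loads(content)
    except json.JSONDecodeError:
        return [json.loads(line) for line in content.splitlines() if line.strip()]


if __name__ == "__main__":
    data_dir = Path(".")  # hypothetical download location
    stormfront_path = data_dir / "stormfront_posts.txt"
    if stormfront_path.exists():
        print(sum(1 for _ in read_posts(stormfront_path)), "non-empty lines")
```

The text and CSV readers are generators, so large files are streamed rather than loaded at once; `read_json_gz` reads the whole decompressed file into memory.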
The following Python modules were used for analysis:
- Gensim's Lda Sequence model (https://radimrehurek.com/gensim/models/ldaseqmodel.html)
- Shifterator (https://shifterator.readthedocs.io/en/latest/)
- pyLDAvis (https://pyldavis.readthedocs.io/en/latest/readme.html)
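Word-shift tools such as Shifterator compare two corpora using word-frequency dictionaries built from each one. The sketch below shows one way to build such dictionaries with the standard library and to compute a simple difference in relative frequency between two corpora. It is an illustration only, not the analysis pipeline used in this project: the tokenizer, the function names, and the sample strings are all assumptions.

```python
import re
from collections import Counter

# Assumption: a simple lowercase tokenizer; the project's actual
# preprocessing is not described in this record.
TOKEN_PATTERN = re.compile(r"[a-z0-9']+")


def word_frequencies(posts):
    """Count how often each token occurs across an iterable of posts."""
    counts = Counter()
    for post in posts:
        counts.update(TOKEN_PATTERN.findall(post.lower()))
    return dict(counts)


def proportion_differences(freq_1, freq_2):
    """Return each word's relative frequency in corpus 2 minus corpus 1.

    Positive values mark words that are relatively more common in corpus 2.
    """
    total_1 = sum(freq_1.values()) or 1  # avoid division by zero
    total_2 = sum(freq_2.values()) or 1
    vocabulary = set(freq_1) | set(freq_2)
    return {
        word: freq_2.get(word, 0) / total_2 - freq_1.get(word, 0) / total_1
        for word in vocabulary
    }


if __name__ == "__main__":
    # Placeholder strings standing in for lines from the two text files.
    corpus_1 = ["an example post", "another example"]
    corpus_2 = ["an example comment"]
    differences = proportion_differences(
        word_frequencies(corpus_1), word_frequencies(corpus_2)
    )
    for word, diff in sorted(differences.items(), key=lambda item: item[1]):
        print(f"{word}\t{diff:+.3f}")
```

In practice the two inputs would be the lines of `stormfront_posts.txt` and `reddit_posts.txt`, streamed one post at a time.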
Funding Sources:
Anti-Defamation League (n/a)
This material is distributed exactly as it arrived from the data depositor. ICPSR has not checked or processed this material. Users should consult the investigator(s) if further information is desired.