Name File Type Size Last Modified
  Twitter-COVID-dataset---Sep2021 09/11/2021 01:41:PM

Project Citation: 

Gupta, Raj, Vishwanath, Ajay, and Yang, Yinping. COVID-19 Twitter Dataset with Latent Topics, Sentiments and Emotions Attributes. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2021-11-04. https://doi.org/10.3886/E120321V11

Project Description

Summary: This project presents a large dataset that researchers can use to explore public conversation on Twitter surrounding the COVID-19 pandemic. From 28 January 2020 to 1 September 2021, we collected over 198 million Twitter posts from more than 25 million unique users using four keywords: “corona”, “wuhan”, “nCov” and “covid”. Leveraging topic modeling techniques and pre-trained machine learning-based emotion analytic algorithms, we labeled each tweet with seventeen semantic attributes: a) ten binary attributes indicating whether the tweet is relevant to each of the top ten detected topics, b) five quantitative emotion attributes indicating the intensity of the valence or sentiment (from 0: very negative to 1: very positive) and the intensity of the fear, anger, happiness and sadness emotions (from 0: not at all to 1: extremely intense), and c) two qualitative attributes indicating the sentiment category (very negative, negative, neutral or mixed, positive, very positive) and the dominant emotion category (fear, anger, happiness, sadness, no specific emotion) that the tweet mainly expresses.
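
The seventeen attributes described in the summary can be pictured as a single record per tweet. The following is a minimal sketch only: the class name, field names and the `validate` helper are hypothetical and are not taken from the dataset's actual CSV headers, which may differ. It encodes just the value ranges and category lists stated above.

```python
from dataclasses import dataclass
from typing import List

# Category labels as listed in the project summary.
SENTIMENT_CATEGORIES = {
    "very negative", "negative", "neutral or mixed", "positive", "very positive",
}
EMOTION_CATEGORIES = {
    "fear", "anger", "happiness", "sadness", "no specific emotion",
}
# Hypothetical field names for the five quantitative attributes.
INTENSITY_FIELDS = ("valence", "fear", "anger", "happiness", "sadness")


@dataclass
class TweetAttributes:
    """Hypothetical per-tweet record; names are assumptions, not real column names."""
    topic_flags: List[int]    # ten binary flags, one per detected topic
    valence: float            # 0 = very negative, 1 = very positive
    fear: float               # 0 = not at all, 1 = extremely intense
    anger: float
    happiness: float
    sadness: float
    sentiment_category: str   # one of SENTIMENT_CATEGORIES
    dominant_emotion: str     # one of EMOTION_CATEGORIES

    def validate(self) -> None:
        """Raise ValueError if any attribute falls outside the documented ranges."""
        if len(self.topic_flags) != 10:
            raise ValueError("expected exactly ten topic flags")
        if any(flag not in (0, 1) for flag in self.topic_flags):
            raise ValueError("topic flags must be 0 or 1")
        for name in INTENSITY_FIELDS:
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must lie in [0, 1], got {value}")
        if self.sentiment_category not in SENTIMENT_CATEGORIES:
            raise ValueError(f"unknown sentiment category: {self.sentiment_category}")
        if self.dominant_emotion not in EMOTION_CATEGORIES:
            raise ValueError(f"unknown emotion category: {self.dominant_emotion}")

    def attribute_count(self) -> int:
        """Ten topic flags + five intensity scores + two categorical labels."""
        return len(self.topic_flags) + len(INTENSITY_FIELDS) + 2


# Illustrative values only; not drawn from the dataset.
example = TweetAttributes(
    topic_flags=[1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    valence=0.2, fear=0.7, anger=0.3, happiness=0.1, sadness=0.4,
    sentiment_category="negative",
    dominant_emotion="fear",
)
example.validate()
```

A check like this may be useful when loading the downloads, since it flags rows whose values fall outside the ranges the summary documents before any analysis is run.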

Scope of Project

Subject Terms: COVID-19; pandemic; twitter; social media; sentiment analysis; emotion recognition
Geographic Coverage: Global
Time Period(s): 1/28/2020 – 9/1/2021
Universe: Twitter posts
Data Type(s): other; program source code; text
Collection Notes: The latest version has data updated up to 1 Sep 2021, including additional CSV downloads for 29 countries.



This material is distributed exactly as it arrived from the data depositor. ICPSR has not checked or processed this material. Users should consult the investigator(s) if further information is desired.