Related papers: The Pushshift Reddit Dataset

200 papers

This contribution argues that Reddit, as a massive, categorized, open-access dataset, is a useful data source for "almost any topic". Hence, it can be used in data science, e.g. for knowledge exploration. This statement is backed up with…

Information Retrieval · Computer Science 2024-10-15 Jan Sawicki, Maria Ganzha, Marcin Paprzycki, Amelia Bădică

The World Wide Web is a complex interconnected digital ecosystem, where information and attention flow between platforms and communities throughout the globe. These interactions co-construct how we understand the world, reflecting and…

Computers and Society · Computer Science 2025-04-17 Patrick Gildersleve, Anna Beers, Viviane Ito, Agustin Orozco, Francesca Tripodi

Online forums provide rich environments where users may post questions and comments about different topics. Understanding how people behave in online forums may shed light on the fundamental mechanisms by which collective thinking emerges…

Social and Information Networks · Computer Science 2020-06-05 Alexey N. Medvedev, Renaud Lambiotte, Jean-Charles Delvenne

Stress is a nigh-universal human experience, particularly in the online world. While stress can be a motivator, too much stress is associated with many negative health outcomes, making its identification useful across a range of domains.…

Computation and Language · Computer Science 2019-11-04 Elsbeth Turcan, Kathleen McKeown

There has been a dramatic increase in the popularity of utilizing social media data for research purposes within the biomedical community. In PubMed alone, there have been nearly 2,500 publication entries since 2014 that deal with analyzing…

Information Retrieval · Computer Science 2020-07-16 Ramya Tekumalla, Juan M. Banda

Music engagement spans diverse interactions with music, from selection and emotional response to its impact on behavior, identity, and social connections. Social media platforms provide spaces where such engagement can be observed in…

Information Retrieval · Computer Science 2025-09-25 Jatin Agarwala, George Paul, Nemani Harsha Vardhan, Vinoo Alluri

Wikipedia is a rich and invaluable source of information. Its central place on the Web makes it a particularly interesting object of study for scientists. Researchers from different domains used various complex datasets related to Wikipedia…

Information Retrieval · Computer Science 2019-03-21 Nicolas Aspert, Volodymyr Miz, Benjamin Ricaud, Pierre Vandergheynst

Social media is very popular for facilitating conversations about important topics and bringing forth insights and issues related to these topics. Reddit serves as a platform that fosters social interactions and hosts engaging discussions…

Social and Information Networks · Computer Science 2024-12-03 Smriti Janaswamy, Jeremy Blackburn

Social media data has become a vital resource for studying mental health, offering real-time insights into thoughts, emotions, and behaviors that traditional methods often miss. Progress in this area has been facilitated by benchmark…

Computation and Language · Computer Science 2025-11-27 Saad Mankarious, Ayah Zirikly, Daniel Wiechmann, Elma Kerz, Edward Kempa, Yu Qiao

Existing image editing models struggle to meet real-world demands. Despite excelling in academic benchmarks, they have yet to be widely adopted for real user needs. Datasets that power these models use artificial edits, lacking the scale…

Computer Vision and Pattern Recognition · Computer Science 2025-04-30 Peter Sushko, Ayana Bharadwaj, Zhi Yang Lim, Vasily Ilin, Ben Caffee, Dongping Chen, Mohammadreza Salehi, Cheng-Yu Hsieh, Ranjay Krishna

In the context of modern life, particularly in Industry 4.0 within the online space, emotions and moods are frequently conveyed through social media posts. The trend of sharing stories, thoughts, and feelings on these platforms generates a…

Machine Learning · Computer Science 2024-11-08 Hai-Yen Phan Nguyen, Phi-Lan Ly, Duc-Manh Le, Trong-Hop Do

In the past few years, Reddit -- a community-driven platform for submitting, commenting and rating links and text posts -- has grown exponentially, from a small community of users into one of the largest online communities on the Web. To…

Social and Information Networks · Computer Science 2014-06-24 Philipp Singer, Fabian Flöck, Clemens Meinhart, Elias Zeitfogel, Markus Strohmaier

As researchers use computational methods to study complex social behaviors at scale, the validity of this computational social science depends on the integrity of the data. On July 2, 2015, Jason Baumgartner published a dataset advertised…

Social and Information Networks · Computer Science 2018-09-05 Devin Gaffney, J. Nathan Matias

Proactively identifying misinformation spreaders is an important step towards mitigating the impact of fake news on our society. In this paper, we introduce a new contemporary Reddit dataset for fake news spreader analysis, called FACTOID,…

Social and Information Networks · Computer Science 2022-05-13 Flora Sakketou, Joan Plepi, Riccardo Cervero, Henri-Jacques Geiss, Paolo Rosso, Lucie Flek

Messaging platforms, especially those with a mobile focus, have become increasingly ubiquitous in society. These mobile messaging platforms can have deceivingly large user bases, and in addition to being a way for people to stay in touch,…

Social and Information Networks · Computer Science 2020-01-24 Jason Baumgartner, Savvas Zannettou, Megan Squire, Jeremy Blackburn

Selective exposure is the main driver for the economy of attention when consuming online content. We select information adhering to our system of beliefs and ignore dissenting information. However, even personal interest is likely to play a…

Computers and Society · Computer Science 2019-12-20 Carlo Michele Valensise, Matteo Cinelli, Alessandro Galeazzi, Walter Quattrociocchi

We present a method for mapping Reddit communities that accounts for temporal shifts, using quantitative and qualitative analyses of clustering techniques to produce high-quality, stable, and meaningful maps for researchers, journalists and…

Social and Information Networks · Computer Science 2024-10-15 Virginia Partridge, Jasmine Mangat, Rebecca Curran, Ryan McGrady, Ethan Zuckerman

As millions of people use ChatGPT for tasks such as education, writing assistance, and health advice, concerns have grown about how personal prompts and data are stored and used. This study explores how Reddit users collectively negotiate…

Computers and Society · Computer Science 2026-03-10 S M Mehedi Zaman, Saubhagya Joshi, Yiyi Wu

High-Frequency Trading (HFT) is pivotal in cryptocurrency markets, demanding rapid decision-making. Social media platforms like Reddit offer valuable, yet underexplored, information for such high-frequency, short-term trading. This paper…

Computation and Language · Computer Science 2025-07-09 Qiuhan Han, Qian Wang, Atsushi Yoshikawa, Masayuki Yamamura

Substance use disorders (SUDs) are a growing concern globally, necessitating enhanced understanding of the problem and its trends through data-driven research. Social media are unique and important sources of information about SUDs,…

Computation and Language · Computer Science 2024-05-13 Yao Ge, Sudeshna Das, Karen O'Connor, Mohammed Ali Al-Garadi, Graciela Gonzalez-Hernandez, Abeed Sarker