Related papers: Macroscopic and microscopic statistical properties…
Collective human behaviors are analyzed using the time series of word appearances in blogs. As expected, we confirm that the number of fluctuations is approximated by a Poisson distribution for very-low-frequency words. A non-trivial…
We focus on the statistics of word occurrences and of the waiting times between such occurrences in Blogs. Due to the heterogeneity of words' frequencies, the empirical analysis is performed by studying classes of "frequently-equivalent"…
To uncover the underlying mechanism of collective human dynamics, we survey more than 1.8 billion blog entries and observe the statistical properties of word appearances. We focus on words that show dynamic growth and decay with a tendency to…
To elucidate the non-trivial empirical statistical properties of fluctuations of a typical non-steady time series representing the appearance of words in blogs, we investigated approximately five billion Japanese blogs over a period of six…
In this paper, a statistical analysis of the structure of one blog community, a kind of social network, is presented. Quantities such as the degree distribution, clustering coefficient, and average shortest path length are calculated to…
Background: Zipf's discovery that word frequency distributions obey a power law established parallels between biological and physical processes, and language, laying the groundwork for a complex systems perspective on human communication.…
Interactions between users in cyberspace may lead to phenomena different from those observed in common social networks. Here we analyse large data sets about users and the blogs that they write and comment on, mapped onto a bipartite graph. In…
Zipf's law has been found in many human-related fields, including language, where the frequency of a word is persistently found to be a power-law function of its frequency rank. However, there is much dispute whether it is…
An important body of quantitative linguistics is constituted by a series of statistical laws about language usage. Despite the importance of these linguistic laws, some of them are poorly formulated, and, more importantly, there is no…
The activity of users on Internet discussion forums is analyzed. The rank of users is shown to be approximated better by a stretched-exponential function than by Zipf's law. The cumulative distribution function is found to be an excellent tool in…
In this paper we quantify the statistical properties and dynamics of the frequency of hashtag use on Twitter. Hashtags are special words used in social media to attract attention and to organize content. Looking at the collection of all…
What dynamics govern a time series representing the appearance of words in social media data? In this paper, we investigate an elementary dynamics, from which word-dependent special effects are segregated, such as breaking news, increasing…
The characterization and understanding of online social network behavior is important from the points of view of both fundamental research and realistic utilization. In this manuscript, we propose a stochastic differential equation to…
Online communities offer a great opportunity to investigate human dynamics, because much information about individuals is registered in databases. In this paper, based on data statistics of online comments on blog posts, we first present…
One of the major sources of trending news, events and opinion in the current age is microblogging. Twitter, one such platform, is extensively used to mine data about public responses and event updates. This paper intends to propose methods…
Zipf's law is just one out of many universal laws proposed to describe statistical regularities in language. Here we review and critically discuss how these laws can be statistically interpreted, fitted, and tested (falsified). The modern…
Citation cascades in blog networks are often considered as traces of information spreading on this social medium. In this work, we question this point of view using both a structural and semantic analysis of five months of activity of the most…
In this paper we combine statistical analysis of large text databases and simple stochastic models to explain the appearance of scaling laws in the statistics of word frequencies. Besides the sublinear scaling of the vocabulary size with…
We build models for the distribution of social states in Twitter communities. States can be defined by the participation vs silence of individuals in conversations that surround key words, and we approximate the joint distribution of these…
The word-frequency distribution provides the fundamental building blocks that generate discourse in language. It is well known, from empirical evidence, that the word-frequency distribution of almost any text is described by Zipf's law, at…