Related papers: Variability in data streams
Consider a network in which $n$ distributed nodes are connected to a single server. Each node continuously observes a data stream consisting of one value per discrete time step. The server has to continuously monitor a given parameter…
In this paper we study graph problems in the dynamic streaming model, where the input is defined by a sequence of edge insertions and deletions. As many natural problems require $\Omega(n)$ space, where $n$ is the number of vertices, existing…
Stream mining poses unique challenges to machine learning: predictive models must be scalable and incrementally trainable, must remain bounded in size (even when the data stream is arbitrarily long), and must be nonparametric in order to…
We consider a basic problem in the general data streaming model, namely, to estimate a vector $f \in \mathbb{Z}^n$ that is arbitrarily updated (i.e., incremented or decremented) coordinate-wise. The estimate $\hat{f} \in \mathbb{Z}^n$ must satisfy…
As graphs continue to grow in size, we seek ways to effectively process such data at scale. The model of streaming graph processing, in which a compact summary is maintained as each edge insertion/deletion is observed, is an attractive one.…
Many streaming algorithms provide only a high-probability relative approximation. These two relaxations, of allowing approximation and randomization, seem necessary -- for many streaming problems, both relaxations must be employed…
In the semi-streaming model, an algorithm must process any $n$-vertex graph by making one or few passes over a stream of its edges, use $O(n \cdot \text{polylog }n)$ words of space, and at the end of the last pass, output a solution to the…
We study the space complexity of estimating the diameter of a subset of points in an arbitrary metric space in the dynamic (turnstile) streaming model. The input is given as a stream of updates to a frequency vector $x \in \mathbb{Z}_{\geq…
We revisit one of the classic problems in the data stream literature, namely, that of estimating the frequency moments $F_p$ for $0 < p < 2$ of an underlying $n$-dimensional vector presented as a sequence of additive updates in a stream. It…
The study of parameterized streaming complexity on graph problems was initiated by Fafianie et al. (MFCS'14) and Chitnis et al. (SODA'15 and SODA'16). Simply put, the main goal is to design streaming algorithms for parameterized problems…
This paper studies streaming optimization problems that have objectives of the form $\sum_{t=1}^T f(\mathbf{x}_{t-1},\mathbf{x}_t)$. In particular, we are interested in how the solution $\hat{\mathbf{x}}_{t|T}$ for the $t$th frame of…
Parameterized complexity attempts to give a more fine-grained analysis of the complexity of problems: instead of measuring the running time as a function of only the input size, we analyze the running time with respect to additional…
We consider the \textsf{Unit Interval Selection} problem in the one-pass random-order streaming model. Here, an algorithm is presented with a sequence of $n$ unit-length intervals on the line that arrive in uniform random order, and the…
We consider the problem of monotone, submodular maximization over a ground set of size $n$ subject to cardinality constraint $k$. For this problem, we introduce the first deterministic algorithms with linear time complexity; these…
Analyzing massive data sets has been one of the key motivations for studying streaming algorithms. In recent years, there has been significant progress in analyzing distributions in a streaming setting, but the progress on graph problems…
A central problem in data streams is to characterize which functions of an underlying frequency vector can be approximated efficiently. Recently there has been considerable effort in extending this problem to that of estimating functions of…
In this paper we introduce a notion of planarity for graphs that are presented in a streaming fashion. A $\textit{streamed graph}$ is a stream of edges $e_1,e_2,...,e_m$ on a vertex set $V$. A streamed graph is $\omega$-$\textit{stream…
Many problems on data streams have been studied at two extremes of difficulty: either allowing randomized algorithms, in the static setting (where they should err with bounded probability on the worst-case stream); or when only…
We consider the problem of finding a minimum cut of a weighted graph presented as a single-pass stream. While graph sparsification in streams has been intensively studied, the specific application of finding minimum cuts in streams is less…
We study the problem of extracting a small subset of representative items from a large data stream. In many data mining and machine learning applications such as social network analysis and recommender systems, this problem can be…