Mikkel Thorup
We study discounted random walks in directed graphs. In each step, the walk either terminates with a constant probability $\alpha$, or proceeds to a random out-neighbor. Our goal is to estimate the probability $\pi(s, t)$ that a discounted…
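The walk described in this abstract can be simulated directly. The following is a minimal sketch, not the paper's estimator: the adjacency-dict representation, the function names, and the rule that a walk stops at a vertex with no out-neighbors are all assumptions made for illustration. Because the definition of $\pi(s, t)$ is cut off above, the code only reports the empirical distribution of the vertex at which the walk terminates.

```python
import random


def discounted_walk(graph, s, alpha, rng):
    """Run one discounted walk from s and return the vertex where it stops."""
    v = s
    while True:
        # With probability alpha the walk terminates at the current vertex.
        if rng.random() < alpha:
            return v
        out = graph.get(v, [])
        if not out:
            # Assumption: a walk stuck at a vertex without out-neighbors stops.
            return v
        # Otherwise proceed to a uniformly random out-neighbor.
        v = rng.choice(out)


def estimate_stop_distribution(graph, s, alpha, walks, seed=0):
    """Monte Carlo estimate of where a discounted walk from s terminates."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(walks):
        t = discounted_walk(graph, s, alpha, rng)
        counts[t] = counts.get(t, 0) + 1
    return {v: c / walks for v, c in counts.items()}
```

On the directed 2-cycle `{0: [1], 1: [0]}` with `alpha = 0.5`, a walk from 0 stops at 0 with probability $0.5 \sum_{k \ge 0} 0.25^k = 2/3$, which gives a quick sanity check for the simulation.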
We study the problem of estimating a vertex's PageRank within a constant relative error, with constant probability. We prove that an adaptive variant of the simple classic bidirectional algorithm is instance-optimal up to a polylogarithmic…
In the vertex connectivity augmentation problem, we are given an undirected $n$-vertex graph $G$, a set of links $L \subseteq \binom{V(G)}{2} \setminus E(G)$, and integers $\lambda$ and $k$. The task is to insert at most $k$ links from $L$…
Correlation clustering is a well-studied problem, first proposed by Bansal, Blum, and Chawla [Mach. Learn. '04]. The input is an unweighted, undirected graph. The problem is to cluster the vertices so as to minimize the number of edges…
We give a near-linear time 4-coloring algorithm for planar graphs, improving on the previous quadratic time algorithm by Robertson et al. from 1996. Such an algorithm cannot be achieved by the known proofs of the Four Color Theorem (4CT).…
The classic pivot-based clustering algorithm of Ailon, Charikar and Newman [JACM'08] is a factor-3 approximation, but all concrete examples showing that it is no better than 3 are based on some very good clusters, e.g., a complete graph minus a matching.…
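For reference, the classic pivot algorithm mentioned in this abstract can be sketched as follows. This is the textbook version only (pick a random unclustered vertex, cluster it with its unclustered neighbors, repeat), not the refinement the abstract goes on to describe; the function names and the edge-list input format are assumptions for illustration.

```python
import random


def pivot_clustering(vertices, edges, seed=0):
    """Cluster by repeatedly picking a random pivot and grabbing its neighbors."""
    rng = random.Random(seed)
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    # Scanning a random permutation and skipping clustered vertices is the
    # same as picking a uniformly random unclustered pivot in each round.
    order = list(vertices)
    rng.shuffle(order)
    clustered = set()
    clusters = []
    for p in order:
        if p in clustered:
            continue
        cluster = {p} | (adj[p] - clustered)
        clustered |= cluster
        clusters.append(cluster)
    return clusters


def disagreements(edges, clusters):
    """Count edges between clusters plus non-edges inside clusters."""
    label = {v: i for i, c in enumerate(clusters) for v in c}
    edge_set = {frozenset(e) for e in edges}
    cut = sum(1 for e in edge_set if len({label[v] for v in e}) == 2)
    inside_pairs = sum(len(c) * (len(c) - 1) // 2 for c in clusters)
    inside_edges = len(edge_set) - cut
    return cut + (inside_pairs - inside_edges)
```

On a graph made of disjoint cliques the algorithm recovers the cliques exactly, at cost 0. On a complete graph minus a matching, the kind of instance named in the abstract, the first pivot leaves its matching partner out of the cluster.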
We study the computational complexity of locally estimating a node's PageRank centrality in a directed graph $G$. For any node $t$, its PageRank centrality $\pi(t)$ is defined as the probability that a random walk in $G$, starting from a…
Correlation Clustering is a fundamental and widely-studied problem in unsupervised learning and data mining. The input is a graph and the goal is to construct a clustering minimizing the number of inter-cluster edges plus the number of…
We present a randomized $\tilde{O}(n^{3.5})$-time algorithm for computing \emph{optimal energetic paths} for an electric car between all pairs of vertices in an $n$-vertex directed graph with positive and negative \emph{costs}. The optimal…
In the Correlation Clustering problem we are given $n$ nodes, and a preference for each pair of nodes indicating whether we prefer the two endpoints to be in the same cluster or not. The output is a clustering inducing the minimum number of…
Hash-based sampling and estimation are common themes in computing. Using hashing for sampling gives us the coordination needed to compare samples from different sets. Hashing is also used when we want to count distinct elements. The quality…
Suppose we have a memory storing $0$s and $1$s and we want to estimate the frequency of $1$s by sampling. We want to do this I/O-efficiently, exploiting that each read gives a block of $B$ bits at unit cost, not just one bit. If the input…
Correlation Clustering is a classic clustering objective arising in numerous machine learning and data mining applications. Given a graph $G=(V,E)$, the goal is to partition the vertex set into clusters so as to minimize the number of edges…
We consider the problem of coloring a 3-colorable graph in polynomial time using as few colors as possible. This is one of the most challenging problems in graph algorithms. In this paper, using Blum's notion of ``progress'', we develop a…
We consider the $\textit{Similarity Sketching}$ problem: Given a universe $[u] = \{0,\ldots, u-1\}$ we want a random function $S$ mapping subsets $A\subseteq [u]$ into vectors $S(A)$ of size $t$, such that the Jaccard similarity $J(A,B) =…
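To make the sketching setup concrete, here is a minimal sketch of the classic baseline with $t$ independent MinHash coordinates. It is not the construction studied in the paper. It assumes the standard definition of Jaccard similarity, $|A \cap B| / |A \cup B|$, non-empty input sets, and salted BLAKE2b as a stand-in for truly random hash functions; all function names are hypothetical.

```python
import hashlib


def _hash(i, x):
    """i-th hash function: BLAKE2b salted with i (stand-in for a random hash)."""
    h = hashlib.blake2b(str(x).encode(), digest_size=8,
                        salt=i.to_bytes(8, "little"))
    return int.from_bytes(h.digest(), "little")


def sketch(A, t):
    """Map a non-empty set A to a vector of t minimum hash values."""
    return [min(_hash(i, x) for x in A) for i in range(t)]


def estimate_jaccard(SA, SB):
    """Fraction of coordinates on which two equal-length sketches agree."""
    return sum(a == b for a, b in zip(SA, SB)) / len(SA)


def jaccard(A, B):
    """Exact Jaccard similarity (standard definition), for comparison."""
    return len(A & B) / len(A | B)
```

Each coordinate agrees with probability close to the Jaccard similarity, so the agreement fraction concentrates around it as $t$ grows. Building this sketch costs $t$ hash evaluations per element, which is the baseline cost that faster sketches aim to reduce.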
An electric car equipped with a battery of a finite capacity travels on a road network with an infrastructure of charging stations. Each charging station has a possibly different cost per unit of energy. Traversing a given road segment…
Given a simple $n$-vertex, $m$-edge graph $G$ undergoing edge insertions and deletions, we give two new fully dynamic algorithms for exactly maintaining the edge connectivity of $G$ in $\tilde{O}(n)$ worst-case update time and…
Dynamic connectivity is one of the most fundamental problems in dynamic graph algorithms. We present a randomized Las Vegas dynamic connectivity data structure with $O(\log n(\log\log n)^2)$ amortized expected update time and $O(\log…
We present a deterministic fully dynamic algorithm with subpolynomial worst-case time per graph update such that after processing each update of the graph, the algorithm outputs a minimum cut of the graph if the graph has a cut of size at…
We revisit Nisan's classical pseudorandom generator (PRG) for space-bounded computation (STOC 1990) and its applications in streaming algorithms. We describe a new generator, HashPRG, that can be thought of as a symmetric version of Nisan's…