Related papers: Distributed Abstraction Algorithm for Online Predi…
Writing correct distributed programs is hard. In spite of extensive testing and debugging, software faults persist even in commercial grade software. Many distributed systems, especially those employed in safety-critical environments,…
In the present paper we formally define the notion of abstract program slicing, a general form of program slicing where properties of data are considered instead of their exact value. This approach is applied to a language with numeric and…
Finding sparse cuts is an important tool in analyzing large-scale distributed networks such as the Internet and Peer-to-Peer networks, as well as large-scale graphs such as the web graph, online social communities, and VLSI circuits. In…
Graph clustering is a fundamental computational problem with a number of applications in algorithm design, machine learning, data mining, and analysis of social networks. Over the past decades, researchers have proposed a number of…
Many important applications across science, data analytics, and AI workloads depend on distributed matrix multiplication. Prior work has developed a large array of algorithms suitable for different problem sizes and partitionings including…
In this paper, we explore statistical versus computational trade-off to address a basic question in the application of a distributed algorithm: what is the minimal computational cost in obtaining statistical optimality? In smoothing spline…
This paper presents a new technique for data slicing of distributed programs running on a hierarchy of machines. Data slicing can be realized as a program transformation that partitions heaps of machines in a hierarchy into independent…
We develop an online learning method for prediction, which is important in problems with large and/or streaming data sets. We formulate the learning approach using a covariance-fitting methodology, and show that the resulting predictor has…
With the rapid growth of Internet technologies, cloud computing and social networks have become ubiquitous. An increasing number of people participate in social networks and massive online social data are obtained. In order to exploit…
Representing sorted integer sequences in small space is a central problem for large-scale retrieval systems such as Web search engines. Efficient query resolution, e.g., intersection or random access, is achieved by carefully partitioning…
Graph clustering is a fundamental computational problem with a number of applications in algorithm design, machine learning, data mining, and analysis of social networks. Over the past decades, researchers have proposed a number of…
Distributed abstract programs are a novel class of distributed optimization problems where (i) the number of variables is much smaller than the number of constraints and (ii) each constraint is associated to a network node. Abstract…
Distributed, online data mining systems have emerged as a result of applications requiring analysis of large amounts of correlated and high-dimensional data produced by multiple distributed data sources. We propose a distributed online data…
Distributed computation is a framework used to break down a complex computational task into smaller tasks and distributing them among computational nodes. Erasure correction codes have recently been introduced and have become a popular…
Prefix aggregation operation (also called scan), and its particular case, prefix summation, is an important parallel primitive and enjoys a lot of attention in the research literature. It is also used in many algorithms as one of the steps.…
In classic distributed graph problems, each instance on a graph specifies a space of feasible solutions (e.g. all proper ($\Delta+1$)-list-colorings of the graph), and the task of distributed algorithm is to construct a feasible solution…
Online prediction methods are typically presented as serial algorithms running on a single processor. However, in the age of web-scale prediction problems, it is increasingly common to encounter situations where a single processor cannot…
In distributional or average-case analysis, the goal is to design an algorithm with good-on-average performance with respect to a specific probability distribution. Distributional analysis can be useful for the study of general-purpose…
Online algorithms process their inputs piece by piece, taking irrevocable decisions for each data item. This model is too restrictive for most partitioning problems, since data that is yet to arrive may render it impossible to extend…
The problem of computing functions of values at the nodes in a network in a totally distributed manner, where nodes do not have unique identities and make decisions based only on local information, has applications in sensor, peer-to-peer,…