Related papers: Network-Oblivious Algorithms

Asynchronous Parallel Algorithms for Nonconvex Optimization

We propose a new asynchronous parallel block-descent algorithmic framework for the minimization of the sum of a smooth nonconvex function and a nonsmooth convex one, subject to both convex and nonconvex constraints. The proposed framework…

Optimization and Control · Mathematics 2018-04-02 Loris Cannelli, Francisco Facchinei, Vyacheslav Kungurtsev, Gesualdo Scutari

Automatic Operator-level Parallelism Planning for Distributed Deep Learning -- A Mixed-Integer Programming Approach

As the artificial intelligence community advances into the era of large models with billions of parameters, distributed training and inference have become essential. While various parallelism strategies-data, model, sequence, and…

Machine Learning · Computer Science 2025-03-13 Ruifeng She, Bowen Pang, Kai Li, Zehua Liu, Tao Zhong

Update Rules for Parameter Estimation in Bayesian Networks

This paper re-examines the problem of parameter estimation in Bayesian networks with missing values and hidden variables from the perspective of recent work in on-line learning [Kivinen & Warmuth, 1994]. We provide a unified framework for…

Machine Learning · Computer Science 2013-02-08 Eric Bauer, Daphne Koller, Yoram Singer

Data Oblivious Algorithms for Multicores

As secure processors such as Intel SGX (with hyperthreading) become widely adopted, there is a growing appetite for private analytics on big data. Most prior works on data-oblivious algorithms adopt the classical PRAM model to capture…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-07-01 Vijaya Ramachandran, Elaine Shi

Optimal Oblivious Reconfigurable Networks

Oblivious routing has a long history in both the theory and practice of networking. In this work we initiate the formal study of oblivious routing in the context of reconfigurable networks, a new architecture that has recently come to the…

Data Structures and Algorithms · Computer Science 2021-11-18 Daniel Amir, Tegan Wilson, Vishal Shrivastav, Hakim Weatherspoon, Robert Kleinberg, Rachit Agarwal

Parallel Selective Algorithms for Big Data Optimization

We propose a decomposition framework for the parallel optimization of the sum of a differentiable (possibly nonconvex) function and a (block) separable nonsmooth, convex one. The latter term is usually employed to enforce structure in the…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-06-18 Francisco Facchinei, Gesualdo Scutari, Simone Sagratella

Flexible Parallel Algorithms for Big Data Optimization

We propose a decomposition framework for the parallel optimization of the sum of a differentiable function and a (block) separable nonsmooth, convex one. The latter term is typically used to enforce structure in the solution as, for…

Distributed, Parallel, and Cluster Computing · Computer Science 2013-11-12 Francisco Facchinei, Simone Sagratella, Gesualdo Scutari

Improving the Space-Time Efficiency of Processor-Oblivious Matrix Multiplication Algorithms

Classic cache-oblivious parallel matrix multiplication algorithms achieve optimality either in time or space, but not both, which promotes lots of research on the best possible balance or tradeoff of such algorithms. We study modern…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-11-14 Yuan Tang

A Partition-insensitive Parallel Framework for Distributed Model Fitting

Distributed model fitting refers to the process of fitting a mathematical or statistical model to the data using distributed computing resources, such that computing tasks are divided among multiple interconnected computers or nodes, often…

Computation · Statistics 2024-06-04 Xiaofei Wu, Rongmei Liang, Fabio Roli, Marcello Pelillo, Jing Yuan

Exploring Differential Obliviousness

In a recent paper Chan et al. [SODA '19] proposed a relaxation of the notion of (full) memory obliviousness, which was introduced by Goldreich and Ostrovsky [J. ACM '96] and extensively researched by cryptographers. The new notion,…

Cryptography and Security · Computer Science 2019-10-04 Amos Beimel, Kobbi Nissim, Mohammad Zaheri

Compact Oblivious Routing

Oblivious routing is an attractive paradigm for large distributed systems in which centralized control and frequent reconfigurations are infeasible or undesired (e.g., costly). Over the last almost 20 years, much progress has been made in…

Networking and Internet Architecture · Computer Science 2018-12-27 Harald Räcke, Stefan Schmid

Model Parallelism on Distributed Infrastructure: A Literature Review from Theory to LLM Case-Studies

Neural networks have become a cornerstone of machine learning. As the trend for these to get more and more complex continues, so does the underlying hardware and software infrastructure for training and deployment. In this survey we answer…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-03-07 Felix Brakel, Uraz Odyurt, Ana-Lucia Varbanescu

Parallel Stochastic Optimization Framework for Large-Scale Non-Convex Stochastic Problems

In this paper, we consider the problem of stochastic optimization, where the objective function is in terms of the expectation of a (possibly non-convex) cost function that is parametrized by a random variable. While the convergence speed…

Information Theory · Computer Science 2019-10-23 Naeimeh Omidvar, An Liu, Vincent Lau, Danny H. K. Tsang, Mohammad Reza Pakravan

Optimal quantum sampling on distributed databases

Quantum sampling, a fundamental subroutine in numerous quantum algorithms, involves encoding a given probability distribution in the amplitudes of a pure state. Given the hefty cost of large-scale quantum storage, we initiate the study of…

Quantum Physics · Physics 2025-06-10 Longyun Chen, Jingcheng Liu, Penghui Yao

Efficient Resource Oblivious Algorithms for Multicores

We consider the design of efficient algorithms for a multicore computing environment with a global shared memory and p cores, each having a cache of size M, and with data organized in blocks of size B. We characterize the class of…

Distributed, Parallel, and Cluster Computing · Computer Science 2011-03-22 Richard Cole, Vijaya Ramachandran

CFP: Efficient Optimization of Intra-Operator Parallelism Plans for Large Model Training

Optimizing the parallel training of large models requires exploring intra-operator parallelism plans for a computation graph that typically contains tens of thousands of primitive operators. While the optimization of parallel data…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-07-08 Weifang Hu, Xuanhua Shi, Yunkai Zhang, Chang Wu, Xuan Peng, Jiaqi Zhai, Hai Jin, Xuehai Qian, Jingling Xue, Yongluan Zhou

Backpropagation training in adaptive quantum networks

We introduce a robust, error-tolerant adaptive training algorithm for generalized learning paradigms in high-dimensional superposed quantum networks, or adaptive quantum networks. The formalized procedure applies standard…

Neurons and Cognition · Quantitative Biology 2015-05-13 Christopher Altman, Romàn R. Zapatrin

Optimal (Randomized) Parallel Algorithms in the Binary-Forking Model

In this paper we develop optimal algorithms in the binary-forking model for a variety of fundamental problems, including sorting, semisorting, list ranking, tree contraction, range minima, and ordered set union, intersection and difference.…

Data Structures and Algorithms · Computer Science 2020-06-26 Guy E. Blelloch, Jeremy T. Fineman, Yan Gu, Yihan Sun

Hybrid Random/Deterministic Parallel Algorithms for Nonconvex Big Data Optimization

We propose a decomposition framework for the parallel optimization of the sum of a differentiable (possibly nonconvex) function and a nonsmooth (possibly nonseparable), convex one. The latter term is usually employed to enforce structure…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-07-19 Amir Daneshmand, Francisco Facchinei, Vyacheslav Kungurtsev, Gesualdo Scutari

Many Sequential Iterative Algorithms Can Be Parallel and (Nearly) Work-efficient

To design efficient parallel algorithms, some recent papers showed that many sequential iterative algorithms can be directly parallelized but there are still challenges in achieving work-efficiency and high-parallelism. Work-efficiency can…

Data Structures and Algorithms · Computer Science 2022-05-27 Zheqi Shen, Zijin Wan, Yan Gu, Yihan Sun