Related papers: Optimal (Randomized) Parallel Algorithms in the Bi…
The binary-forking model is a parallel computation model, formally defined by Blelloch et al. very recently, in which a thread can fork a concurrent child thread, recursively and asynchronously. The model incurs a cost of $\Theta(\log n)$…
Parallel machine scheduling has been extensively studied in the past decades, with applications ranging from production planning to job processing in large computing clusters. In this work we study some of these fundamental optimization…
The number of parameters in large-scale language models based on transformers is gradually increasing, and the scale of computing clusters is also growing. The technology of quickly mobilizing large amounts of computing resources for…
As secure processors such as Intel SGX (with hyperthreading) become widely adopted, there is a growing appetite for private analytics on big data. Most prior works on data-oblivious algorithms adopt the classical PRAM model to capture…
Parallelization is a popular strategy for improving the performance of iterative algorithms. Optimization methods are no exception: design of efficient parallel optimization methods and tight analysis of their theoretical properties are…
Many statistical learning problems can be posed as minimization of a sum of two convex functions, one typically a composition of non-smooth and linear functions. Examples include regression under structured sparsity assumptions. Popular…
Balanced search trees are widely used in computer science to efficiently maintain dynamic ordered data. To support efficient set operations (e.g., union, intersection, difference) using trees, the join-based framework is widely studied.…
Machine scheduling problems involving conflict jobs can be seen as a constrained version of the classical scheduling problem, in which some jobs are conflict in the sense that they cannot be proceeded simultaneously on different machines.…
Chance constrained program is computationally intractable due to the existence of chance constraints, which are randomly disturbed and should be satisfied with a probability. This paper proposes a two-layer randomized algorithm to address…
Optimistic parallelization is a promising approach for the parallelization of irregular algorithms: potentially interfering tasks are launched dynamically, and the runtime system detects conflicts between concurrent activities, aborting and…
The effective use of parallel computing resources to speed up algorithms in current multi-core parallel architectures remains a difficult challenge, with ease of programming playing a key role in the eventual success of various parallel…
In this paper, we consider an approach to the parallelizing of the algorithms realizing the modified probability changigng method with adaptation and partial rollback procedure for constrained pseudo-Boolean optimization problems. Existing…
It has been found that stochastic algorithms often find good solutions much more rapidly than inherently-batch approaches. Indeed, a very useful rule of thumb is that often, when solving a machine learning problem, an iterative technique…
Shared memory programming models usually provide worksharing and task constructs. The former relies on the efficient fork-join execution model to exploit structured parallelism; while the latter relies on fine-grained synchronization among…
Many tasks in data mining and related fields can be formalized as matching between objects in two heterogeneous domains, including collaborative filtering, link prediction, image tagging, and web search. Machine learning techniques,…
The research in parallel machine scheduling in combinatorial optimization suggests that the desirable parallel efficiency could be achieved when the jobs are sorted in the non-increasing order of processing times. In this paper, we find…
In this paper, we consider the problem of scheduling an application on a parallel computational platform. The application is a particular task graph, either a linear chain of tasks, or a set of independent tasks. The platform is made of…
We study exploration in stochastic multi-armed bandits when we have access to a divisible resource that can be allocated in varying amounts to arm pulls. We focus in particular on the allocation of distributed computing resources, where we…
Parallel parameterized complexity theory studies how fixed-parameter tractable (fpt) problems can be solved in parallel. Previous theoretical work focused on parallel algorithms that are very fast in principle, but did not take into account…
This paper investigates the execution of tree-shaped task graphs using multiple processors. Each edge of such a tree represents some large data. A task can only be executed if all input and output data fit into memory, and a data can only…