Related papers: Aggressive Aggregation: a New Paradigm for Program…
Within the framework of the augmented Lagrangian (AL), we propose a novel distributed optimization method, termed Distributed Augmented Lagrangian Decomposition (DALD), and provide a rigorous convergence proof for its standard version. To…
Program specialization is a program transformation methodology which improves program efficiency by exploiting the information about the input data which are available at compile time. We show that current techniques for program…
Learning the structure of Directed Acyclic Graphs (DAGs) presents a significant challenge due to the vast combinatorial search space of possible graphs, which scales exponentially with the number of nodes. Recent advancements have redefined…
The increasing number of flexible devices and distributed energy resources in power grids renders the coordination of transmission and distribution systems increasingly complex. In this paper, we discuss and compare two different approaches…
Partial evaluation (PE) is a powerful and general program optimization technique with many successful applications. However, it has never been investigated in the context of expressive rule-based languages like Maude, CafeOBJ, OBJ, ASF+SDF,…
Federated Learning is a popular distributed learning paradigm in machine learning. Meanwhile, composition optimization is an effective hierarchical learning model, which appears in many machine learning applications such as meta learning…
This dissertation explores block decomposable methods for large-scale optimization problems. It focuses on alternating direction method of multipliers (ADMM) schemes and block coordinate descent (BCD) methods. Specifically, it introduces a…
In modern data science problems, techniques for extracting value from big data require performing large-scale optimization over heterogenous, irregularly structured data. Much of this data is best represented as multi-relational graphs,…
This paper presents a novel optimization for differentiable programming named coarsening optimization. It offers a systematic way to synergize symbolic differentiation and algorithmic differentiation (AD). Through it, the granularity of the…
We propose a new variant of AMSGrad, a popular adaptive gradient based optimization algorithm widely used for training deep neural networks. Our algorithm adds prior knowledge about the sequence of consecutive mini-batch gradients and…
Despite the numerous uses of semidefinite programming (SDP) and its universal solvability via interior point methods (IPMs), it is rarely applied to practical large-scale problems. This mainly owes to the computational cost of IPMs that…
With the development of machine learning and Big Data, the concepts of linear and non-linear optimization techniques are becoming increasingly valuable for many quantitative disciplines. Problems of that nature are typically solved using…
Automatic code optimization is a complex process that typically involves the application of multiple discrete algorithms that modify the program structure irreversibly. However, the design of these algorithms is often monolithic, and they…
In this paper, we propose a methodology for partitioning and mapping computational intensive applications in reconfigurable hardware blocks of different granularity. A generic hybrid reconfigurable architecture is considered so as the…
We develop a novel gradient-based algorithm for optimizing nonsmooth nonconvex functions where nonsmoothness arises from explicit nonsmooth operators in the objective's analytical form. Our key innovation involves encoding active smooth…
In this paper, we investigate the empirical counterpart of Group Distributionally Robust Optimization (GDRO), which aims to minimize the maximal empirical risk across $m$ distinct groups. We formulate empirical GDRO as a…
This paper delves into the investigation of a distributed aggregative optimization problem within a network. In this scenario, each agent possesses its own local cost function, which relies not only on the local state variable but also on…
Stochastic gradient descent-based algorithms are widely used for training deep neural networks but often suffer from slow convergence. To address the challenge, we leverage the framework of the alternating direction method of multipliers…
Douglas-Rachford splitting and its equivalent dual formulation ADMM are widely used iterative methods in composite optimization problems arising in control and machine learning applications. The performance of these algorithms depends on…
The remarkable growth and significant success of machine learning have expanded its applications into programming languages and program analysis. However, a key challenge in adopting the latest machine learning methods is the representation…