Related papers: Large-Scale Clustering Based on Data Compression

Consensus-Based Dantzig-Wolfe Decomposition

Dantzig-Wolfe decomposition (DWD) is a classical algorithm for solving large-scale linear programs whose constraint matrix involves a set of independent blocks coupled with a set of linking rows. The algorithm decomposes such a model into a…

Optimization and Control · Mathematics 2021-01-12 Mohamed El Tonbari, Shabbir Ahmed

Clustering of Data with Missing Entries

The analysis of large datasets is often complicated by the presence of missing entries, mainly because most of the current machine learning algorithms are designed to work with full data. The main focus of this work is to introduce a…

Machine Learning · Computer Science 2018-01-08 Sunrita Poddar, Mathews Jacob

Large-scale nonconvex optimization: randomization, gap estimation, and numerical resolution

We address a large-scale and nonconvex optimization problem, involving an aggregative term. This term can be interpreted as the sum of the contributions of N agents to some common good, with N large. We investigate a relaxation of this…

Optimization and Control · Mathematics 2023-06-19 J. Frédéric Bonnans, Kang Liu, Nadia Oudjane, Laurent Pfeiffer, Cheng Wan

Efficient Large Scale Clustering based on Data Partitioning

Clustering techniques are very attractive for extracting and identifying patterns in datasets. However, their application to very large spatial datasets presents numerous challenges such as high-dimensionality data, heterogeneity, and high…

Databases · Computer Science 2018-02-27 Malika Bendechache, Nhien-An Le-Khac, M-Tahar Kechadi

Recovering Dantzig-Wolfe Bounds by Cutting Planes

Dantzig-Wolfe (DW) decomposition is a well-known technique in mixed-integer programming (MIP) for decomposing and convexifying constraints to obtain potentially strong dual bounds. We investigate cutting planes that can be derived using the…

Optimization and Control · Mathematics 2023-10-09 Rui Chen, Oktay Gunluk, Andrea Lodi

Clustering with Capacity and Size Constraints: A Deterministic Approach

This paper discusses a deterministic clustering approach to capacitated resource allocation problems. In particular, the Deterministic Annealing (DA) algorithm from the data-compression literature, which bears a distinct analogy to the…

Optimization and Control · Mathematics 2016-06-22 Mayank Baranwal, Srinivasa M. Salapaka

Primal heuristics for Dantzig-Wolfe decomposition for unit commitment

The unit commitment problem is a short-term planning problem in the energy industry. Dantzig-Wolfe decomposition is a popular approach to solve the problem. This paper focuses on primal heuristics used with Dantzig-Wolfe decomposition. We…

Optimization and Control · Mathematics 2022-03-01 Nagisa Sugishita, Andreas Grothey, Ken McKinnon

An Aggregate and Iterative Disaggregate Algorithm with Proven Optimality in Machine Learning

We propose a clustering-based iterative algorithm to solve certain optimization problems in machine learning, where we start the algorithm by aggregating the original data, solving the problem on aggregated data, and then in subsequent…

Machine Learning · Statistics 2017-01-23 Young Woong Park, Diego Klabjan

Mechanism Design via Dantzig-Wolfe Decomposition

In random allocation rules, typically first an optimal fractional point is calculated via solving a linear program. The calculated point represents a fractional assignment of objects or more generally packages of objects to agents. In order…

Computer Science and Game Theory · Computer Science 2016-08-16 Salman Fadaei

Clust-Splitter - an Efficient Nonsmooth Optimization-Based Algorithm for Clustering Large Datasets

Clustering is a fundamental task in data mining and machine learning, particularly for analyzing large-scale data. In this paper, we introduce Clust-Splitter, an efficient algorithm based on nonsmooth optimization, designed to solve the…

Machine Learning · Computer Science 2026-03-19 Jenni Lampainen, Kaisa Joki, Napsu Karmitsa, Marko M. Mäkelä

A Dantzig-Wolfe Decomposition Method for Quasi-Variational Inequalities

We propose an algorithm to solve quasi-variational inequality problems, based on the Dantzig-Wolfe decomposition paradigm. Our approach solves in the subproblems variational inequalities, which is a simpler problem, while restricting…

Optimization and Control · Mathematics 2026-02-02 Manoel Jardim, Claudia Sagastizábal, Mikhail Solodov

Clustering of Big Data with Mixed Features

Clustering large, mixed data is a central problem in data mining. Many approaches adopt the idea of k-means, and hence are sensitive to initialisation, detect only spherical clusters, and require a priori the unknown number of clusters. We…

Machine Learning · Statistics 2020-11-13 Joshua Tobin, Mimi Zhang

A Quantum Annealing-Based Approach to Extreme Clustering

Clustering, or grouping, dataset elements based on similarity can be used not only to classify a dataset into a few categories, but also to approximate it by a relatively large number of representative elements. In the latter scenario,…

Machine Learning · Computer Science 2019-09-13 Tim Jaschek, Marko Bucyk, Jaspreet S. Oberoi

Efficient Clustering for Stretched Mixtures: Landscape and Optimality

This paper considers a canonical clustering problem where one receives unlabeled samples drawn from a balanced mixture of two elliptical distributions and aims for a classifier to estimate the labels. Many popular methods including PCA and…

Machine Learning · Statistics 2021-11-30 Kaizheng Wang, Yuling Yan, Mateo Díaz

Variable Partitioning for Distributed Optimization

This paper is about how to partition decision variables while decomposing a large-scale optimization problem for the best performance of distributed solution methods. Solving a large-scale optimization problem sequentially can be…

Optimization and Control · Mathematics 2017-10-26 Yuchen Zheng, Ilbin Lee, Nicoleta Serban

Probabilistic Partitive Partitioning (PPP)

Clustering is a NP-hard problem. Thus, no optimal algorithm exists, heuristics are applied to cluster the data. Heuristics can be very resource-intensive, if not applied properly. For substantially large data sets computational efficiencies…

Databases · Computer Science 2020-03-11 Mujahid Sultan

Stochastic Sparse Subspace Clustering

State-of-the-art subspace clustering methods are based on self-expressive model, which represents each data point as a linear combination of other data points. By enforcing such representation to be sparse, sparse subspace clustering is…

Machine Learning · Computer Science 2020-05-05 Ying Chen, Chun-Guang Li, Chong You

Micro-Clustering: Finding Small Clusters in Large Diversity

We address the problem of un-supervised soft-clustering called micro-clustering. The aim of the problem is to enumerate all groups composed of records strongly related to each other, while standard clustering methods separate records at…

Data Structures and Algorithms · Computer Science 2016-06-07 Takeaki Uno, Hiroki Maegawa, Takanobu Nakahara, Yukinobu Hamuro, Ryo Yoshinaka, Makoto Tatsuta

Clustering Optimisation Method for Highly Connected Biological Data

Currently, data-driven discovery in biological sciences resides in finding segmentation strategies in multivariate data that produce sensible descriptions of the data. Clustering is but one of several approaches and sometimes falls short…

Quantitative Methods · Quantitative Biology 2022-08-12 Richard Tjörnhammar

On Perfect Classification and Clustering for Gaussian Processes

In this paper, we propose a data based transformation for infinite-dimensional Gaussian processes and derive its limit theorem. For a classification problem, this transformation induces complete separation among the associated Gaussian…

Statistics Theory · Mathematics 2022-03-25 Juan A. Cuesta-Albertos, Subhajit Dutta