Related papers: Large-Scale Clustering Based on Data Compression

200 papers

Dantzig-Wolfe decomposition (DWD) is a classical algorithm for solving large-scale linear programs whose constraint matrix involves a set of independent blocks coupled with a set of linking rows. The algorithm decomposes such a model into a…

Optimization and Control · Mathematics 2021-01-12 Mohamed El Tonbari, Shabbir Ahmed

The analysis of large datasets is often complicated by the presence of missing entries, mainly because most of the current machine learning algorithms are designed to work with full data. The main focus of this work is to introduce a…

Machine Learning · Computer Science 2018-01-08 Sunrita Poddar, Mathews Jacob

We address a large-scale and nonconvex optimization problem, involving an aggregative term. This term can be interpreted as the sum of the contributions of N agents to some common good, with N large. We investigate a relaxation of this…

Optimization and Control · Mathematics 2023-06-19 J. Frédéric Bonnans, Kang Liu, Nadia Oudjane, Laurent Pfeiffer, Cheng Wan

Clustering techniques are very attractive for extracting and identifying patterns in datasets. However, their application to very large spatial datasets presents numerous challenges such as high-dimensionality data, heterogeneity, and high…

Databases · Computer Science 2018-02-27 Malika Bendechache, Nhien-An Le-Khac, M-Tahar Kechadi

Dantzig-Wolfe (DW) decomposition is a well-known technique in mixed-integer programming (MIP) for decomposing and convexifying constraints to obtain potentially strong dual bounds. We investigate cutting planes that can be derived using the…

Optimization and Control · Mathematics 2023-10-09 Rui Chen, Oktay Gunluk, Andrea Lodi

This paper discusses a deterministic clustering approach to capacitated resource allocation problems. In particular, the Deterministic Annealing (DA) algorithm from the data-compression literature, which bears a distinct analogy to the…

Optimization and Control · Mathematics 2016-06-22 Mayank Baranwal, Srinivasa M. Salapaka

The unit commitment problem is a short-term planning problem in the energy industry. Dantzig-Wolfe decomposition is a popular approach to solve the problem. This paper focuses on primal heuristics used with Dantzig-Wolfe decomposition. We…

Optimization and Control · Mathematics 2022-03-01 Nagisa Sugishita, Andreas Grothey, Ken McKinnon

We propose a clustering-based iterative algorithm to solve certain optimization problems in machine learning, where we start the algorithm by aggregating the original data, solving the problem on aggregated data, and then in subsequent…

Machine Learning · Statistics 2017-01-23 Young Woong Park, Diego Klabjan

In random allocation rules, typically first an optimal fractional point is calculated via solving a linear program. The calculated point represents a fractional assignment of objects or more generally packages of objects to agents. In order…

Computer Science and Game Theory · Computer Science 2016-08-16 Salman Fadaei

Clustering is a fundamental task in data mining and machine learning, particularly for analyzing large-scale data. In this paper, we introduce Clust-Splitter, an efficient algorithm based on nonsmooth optimization, designed to solve the…

Machine Learning · Computer Science 2026-03-19 Jenni Lampainen, Kaisa Joki, Napsu Karmitsa, Marko M. Mäkelä

We propose an algorithm to solve quasi-variational inequality problems, based on the Dantzig-Wolfe decomposition paradigm. Our approach solves in the subproblems variational inequalities, which is a simpler problem, while restricting…

Optimization and Control · Mathematics 2026-02-02 Manoel Jardim, Claudia Sagastizábal, Mikhail Solodov

Clustering large, mixed data is a central problem in data mining. Many approaches adopt the idea of k-means, and hence are sensitive to initialisation, detect only spherical clusters, and require a priori the unknown number of clusters. We…

Machine Learning · Statistics 2020-11-13 Joshua Tobin, Mimi Zhang

Clustering, or grouping, dataset elements based on similarity can be used not only to classify a dataset into a few categories, but also to approximate it by a relatively large number of representative elements. In the latter scenario,…

Machine Learning · Computer Science 2019-09-13 Tim Jaschek, Marko Bucyk, Jaspreet S. Oberoi

This paper considers a canonical clustering problem where one receives unlabeled samples drawn from a balanced mixture of two elliptical distributions and aims for a classifier to estimate the labels. Many popular methods including PCA and…

Machine Learning · Statistics 2021-11-30 Kaizheng Wang, Yuling Yan, Mateo Díaz

This paper is about how to partition decision variables while decomposing a large-scale optimization problem for the best performance of distributed solution methods. Solving a large-scale optimization problem sequentially can be…

Optimization and Control · Mathematics 2017-10-26 Yuchen Zheng, Ilbin Lee, Nicoleta Serban

Clustering is an NP-hard problem. Thus, no optimal algorithm exists, and heuristics are applied to cluster the data. Heuristics can be very resource-intensive if not applied properly. For substantially large data sets computational efficiencies…

Databases · Computer Science 2020-03-11 Mujahid Sultan

State-of-the-art subspace clustering methods are based on the self-expressive model, which represents each data point as a linear combination of other data points. By enforcing such representation to be sparse, sparse subspace clustering is…

Machine Learning · Computer Science 2020-05-05 Ying Chen, Chun-Guang Li, Chong You

We address the problem of un-supervised soft-clustering called micro-clustering. The aim of the problem is to enumerate all groups composed of records strongly related to each other, while standard clustering methods separate records at…

Data Structures and Algorithms · Computer Science 2016-06-07 Takeaki Uno, Hiroki Maegawa, Takanobu Nakahara, Yukinobu Hamuro, Ryo Yoshinaka, Makoto Tatsuta

Currently, data-driven discovery in biological sciences resides in finding segmentation strategies in multivariate data that produce sensible descriptions of the data. Clustering is but one of several approaches and sometimes falls short…

Quantitative Methods · Quantitative Biology 2022-08-12 Richard Tjörnhammar

In this paper, we propose a data based transformation for infinite-dimensional Gaussian processes and derive its limit theorem. For a classification problem, this transformation induces complete separation among the associated Gaussian…

Statistics Theory · Mathematics 2022-03-25 Juan A. Cuesta-Albertos, Subhajit Dutta