Related papers: Dissimilarity-based Sparse Subset Selection
In this paper, we present a new image segmentation method based on the concept of sparse subset selection. Starting with an over-segmentation, we adopt local spectral histogram features to encode the visual information of the small segments…
Optimization approaches based on operator splitting are becoming popular for solving sparsity regularized statistical machine learning models. While many have proposed fast algorithms to solve these problems for a single regularization…
Motivated by distributed machine learning settings such as Federated Learning, we consider the problem of fitting a statistical model across a distributed collection of heterogeneous data sets whose similarity structure is encoded by a…
In many real-world problems, we are dealing with collections of high-dimensional data, such as images, videos, text and web documents, DNA microarray data, and more. Often, high-dimensional data lie close to low-dimensional structures…
This paper addresses the problem of identifying a very small subset of data points that belong to a significantly larger massive dataset (i.e., Big Data). The small number of selected data points must adequately represent and faithfully…
The parallel alternating direction method of multipliers (ADMM) algorithm is widely recognized for its effectiveness in handling large-scale datasets stored in a distributed manner, making it a popular choice for solving statistical…
Image foreground extraction is a classical problem in image processing and vision, with a large range of applications. In this dissertation, we focus on the extraction of text and graphics in mixed-content images, and design novel…
In several applications, input samples are more naturally represented in terms of similarities between each other, rather than in terms of feature vectors. In these settings, machine-learning algorithms can become very computationally…
In recent years, considerable attention has been devoted to the regularization models due to the presence of high-dimensional data in scientific research. Sparse support vector machine (SVM) are useful tools in high-dimensional data…
To better select the correct training sample and obtain the robust representation of the query sample, this paper proposes a discriminant-based sparse optimization learning model. This learning model integrates discriminant and sparsity…
Mathematical modelling, particularly through approaches such as structured sparse support vector machines (SS-SVM), plays a crucial role in processing data with complex feature structures, yet efficient algorithms for distributed…
Many computer vision systems require low-cost segmentation algorithms based on deep learning, either because of the enormous size of input images or limited computational budget. Common solutions uniformly downsample the input images to…
This paper introduces the Bi-linear consensus Alternating Direction Method of Multipliers (Bi-cADMM), aimed at solving large-scale regularized Sparse Machine Learning (SML) problems defined over a network of computational nodes.…
We propose a method to reconstruct and cluster incomplete high-dimensional data lying in a union of low-dimensional subspaces. Exploring the sparse representation model, we jointly estimate the missing data while imposing the intrinsic…
We propose an automatic parameter selection strategy for the problem of image super-resolution for images corrupted by blur and additive white Gaussian noise with unknown standard deviation. The proposed approach exploits the structure of…
With the availability of extraordinarily huge data sets, solving the problems of distributed statistical methodology and computing for such data sets has become increasingly crucial in the big data area. In this paper, we focus on the…
Superpixel segmentation aims at dividing the input image into some representative regions containing pixels with similar and consistent intrinsic properties, without any prior knowledge about the shape and size of each superpixel. In this…
Distributed model fitting refers to the process of fitting a mathematical or statistical model to the data using distributed computing resources, such that computing tasks are divided among multiple interconnected computers or nodes, often…
Teaching requires distilling a rich category distribution into a small set of informative exemplars. Although prior work shows that humans consider both representativeness and diversity when teaching, the computational principles underlying…
State-of-the-art subspace clustering methods are based on self-expressive model, which represents each data point as a linear combination of other data points. By enforcing such representation to be sparse, sparse subspace clustering is…