Related papers: CRUSH: fast and scalable data reduction for imagin…

Fast clustering for scalable statistical analysis on structured images

The use of brain images as markers for diseases or behavioral differences is challenged by small effect sizes and the ensuing lack of power, an issue that has incited researchers to rely more systematically on large cohorts. Coupled…

Machine Learning · Statistics 2015-11-17 Bertrand Thirion, Andrés Hoyos-Idrobo, Jonas Kahn, Gael Varoquaux

Scanning and Sequential Decision Making for Multi-Dimensional Data - Part I: the Noiseless Case

We investigate the problem of scanning and prediction ("scandiction", for short) of multidimensional data arrays. This problem arises in several aspects of image and video processing, such as predictive coding, for example, where an image…

Information Theory · Computer Science 2007-07-13 Asaf Cohen, Neri Merhav, Tsachy Weissman

Spectral Clustering with Smooth Tiny Clusters

Spectral clustering is one of the most prominent clustering approaches. The distance-based similarity is the most widely used method for spectral clustering. However, people have already noticed that this is not suitable for multi-scale…

Machine Learning · Computer Science 2020-09-11 Hengrui Wang, Yubo Zhang, Mingzhi Chen, Tong Yang

StruClus: Structural Clustering of Large-Scale Graph Databases

We present a structural clustering algorithm for large-scale datasets of small labeled graphs, utilizing a frequent subgraph sampling strategy. A set of representatives provides an intuitive description of each cluster, supports the…

Databases · Computer Science 2016-10-03 Till Schäfer, Petra Mutzel

Seismic noise suppression: array stations, waveform cross-correlation, and noise stochastization

Seismic noise with an amplitude higher than that of the sought signal is a challenge for detection. Several techniques have been developed to suppress the ambient noise and to reduce the detection threshold in order to find signals with the…

Geophysics · Physics 2026-04-27 Ivan Kitov

An Overview on Clustering Methods

Clustering is a common technique for statistical data analysis, which is used in many fields, including machine learning, data mining, pattern recognition, image analysis and bioinformatics. Clustering is the process of grouping similar…

Data Structures and Algorithms · Computer Science 2012-05-08 T. Soni Madhulatha

CRH: A Simple Benchmark Approach to Continuous Hashing

In recent years, the distinctive advancement of handling huge data promotes the evolution of ubiquitous computing and analysis technologies. With the constantly upward system burden and computational complexity, adaptive coding has been a…

Computer Vision and Pattern Recognition · Computer Science 2018-10-16 Miao Cheng, Ah Chung Tsoi

SCHENO: Measuring Schema vs. Noise in Graphs

Real-world data is typically a noisy manifestation of a core pattern (schema), and the purpose of data mining algorithms is to uncover that pattern, thereby splitting (i.e. decomposing) the data into schema and noise. We introduce SCHENO, a…

Databases · Computer Science 2025-02-05 Justus Isaiah Hibshman, Adnan Hoq, Tim Weninger

CRASH: Raw Audio Score-based Generative Modeling for Controllable High-resolution Drum Sound Synthesis

In this paper, we propose a novel score-based generative model for unconditional raw audio synthesis. Our proposal builds upon the latest developments on diffusion process modeling with stochastic differential equations, which already…

Sound · Computer Science 2021-06-15 Simon Rouard, Gaëtan Hadjeres

Multidimensional Contrast Limited Adaptive Histogram Equalization

Contrast enhancement is an important preprocessing technique for improving the performance of downstream tasks in image processing and computer vision. Among the existing approaches based on nonlinear histogram transformations, contrast…

Image and Video Processing · Electrical Eng. & Systems 2020-05-19 Vincent Stimper, Stefan Bauer, Ralph Ernstorfer, Bernhard Schölkopf, R. Patrick Xian

Clustering via Boundary Erosion

Clustering analysis identifies samples as groups based on either their mutual closeness or homogeneity. In order to detect clusters in arbitrary shapes, a novel and generic solution based on boundary erosion is proposed. The clusters are…

Computer Vision and Pattern Recognition · Computer Science 2018-04-16 Cheng-Hao Deng, Wan-Lei Zhao

Enabling Efficient Dynamic Resizing of Large DRAM Caches via A Hardware Consistent Hashing Mechanism

Die-stacked DRAM has been proposed for use as a large, high-bandwidth, last-level cache with hundreds or thousands of megabytes of capacity. Not all workloads (or phases) can productively utilize this much cache space, however.…

Hardware Architecture · Computer Science 2016-02-03 Kevin K. Chang, Gabriel H. Loh, Mithuna Thottethodi, Yasuko Eckert, Mike O'Connor, Srilatha Manne, Lisa Hsu, Lavanya Subramanian, Onur Mutlu

Cluster-based Kriging Approximation Algorithms for Complexity Reduction

Kriging or Gaussian Process Regression is applied in many fields as a non-linear regression model as well as a surrogate model in the field of evolutionary computation. However, the computational and space complexity of Kriging, that is…

Machine Learning · Computer Science 2017-02-07 Bas van Stein, Hao Wang, Wojtek Kowalczyk, Michael Emmerich, Thomas Bäck

FLASC: A Flare-Sensitive Clustering Algorithm

Clustering algorithms are often used to find subpopulations in exploratory data analysis workflows. Not only the clusters themselves, but also their shape can represent meaningful subpopulations. In this paper, we present FLASC, an…

Machine Learning · Computer Science 2025-04-23 D. M. Bot, J. Peeters, J. Liesenborgs, J. Aerts

SHADE: Deep Density-based Clustering

Detecting arbitrarily shaped clusters in high-dimensional noisy data is challenging for current clustering methods. We introduce SHADE (Structure-preserving High-dimensional Analysis with Density-based Exploration), the first deep…

Machine Learning · Computer Science 2024-10-10 Anna Beer, Pascal Weber, Lukas Miklautz, Collin Leiber, Walid Durani, Christian Böhm, Claudia Plant

When is Clustering Perturbation Robust?

Clustering is a fundamental data mining tool that aims to divide data into groups of similar items. Generally, intuition about clustering reflects the ideal case -- exact data sets endowed with flawless dissimilarity between individual…

Machine Learning · Computer Science 2016-01-25 Margareta Ackerman, Jarrod Moore

FLASH: Fast Bayesian Optimization for Data Analytic Pipelines

Modern data science relies on data analytic pipelines to organize interdependent computational steps. Such analytic pipelines often involve different algorithms across multiple steps, each with its own hyperparameters. To achieve the best…

Machine Learning · Computer Science 2016-06-27 Yuyu Zhang, Mohammad Taha Bahadori, Hang Su, Jimeng Sun

ADBSCAN: Adaptive Density-Based Spatial Clustering of Applications with Noise for Identifying Clusters with Varying Densities

Density-based spatial clustering of applications with noise (DBSCAN) is a data clustering algorithm that performs well on datasets where clusters have a constant density of data points. One of the significant attributes…

Machine Learning · Computer Science 2019-02-06 Mohammad Mahmudur Rahman Khan, Md. Abu Bakr Siddique, Rezoana Bente Arif, Mahjabin Rahman Oishe

Data Smashing

Investigation of the underlying physics or biology from empirical data requires a quantifiable notion of similarity - when do two observed data sets indicate nearly identical generating processes, and when they do not. The discriminating…

Machine Learning · Computer Science 2014-01-07 Ishanu Chattopadhyay, Hod Lipson

Throughput Scaling Of Convolution For Error-Tolerant Multimedia Applications

Convolution and cross-correlation are the basis of filtering and pattern or template matching in multimedia signal processing. We propose two throughput scaling options for any one-dimensional convolution kernel in programmable processors…

Multimedia · Computer Science 2012-01-17 Mohammad Ashraful Anam, Yiannis Andreopoulos