Related papers: EBIC: an evolutionary-based parallel biclustering …

EBIC: an open source software for high-dimensional and big data biclustering analyses

Motivation: In this paper we present the latest release of EBIC, a next-generation biclustering algorithm for mining genetic data. The major contribution of this paper is adding support for big data, making it possible to efficiently run…

Genomics · Quantitative Biology 2024-09-05 Patryk Orzechowski, Jason H. Moore

EBIC.JL -- an Efficient Implementation of Evolutionary Biclustering Algorithm in Julia

Biclustering is a data mining technique which searches for local patterns in numeric tabular data with main application in bioinformatics. This technique has shown promise in multiple areas, including development of biomarkers for cancer,…

Machine Learning · Computer Science 2021-05-05 Paweł Renc, Patryk Orzechowski, Aleksander Byrski, Jarosław Wąs, Jason H. Moore

HBIC: A Biclustering Algorithm for Heterogeneous Datasets

Biclustering is an unsupervised machine-learning approach aiming to cluster rows and columns simultaneously in a data matrix. Several biclustering algorithms have been proposed for handling numeric datasets. However, real-world data mining…

Machine Learning · Computer Science 2024-08-26 Adán José-García, Julie Jacques, Clément Chauvet, Vincent Sobanski, Clarisse Dhaenens

Contributions to Biclustering of Microarray Data Using Formal Concept Analysis

Biclustering is an unsupervised data mining technique that aims to unveil patterns (biclusters) from gene expression data matrices. In the framework of this thesis, we propose new biclustering algorithms for microarray data. The latter is…

Machine Learning · Computer Science 2018-11-26 Amina Houari

Conjoined Dirichlet Process

Biclustering is a class of techniques that simultaneously clusters the rows and columns of a matrix to sort heterogeneous data into homogeneous blocks. Although many algorithms have been proposed to find biclusters, existing methods suffer…

Machine Learning · Statistics 2020-02-11 Michelle N. Ngo, Dustin S. Pluta, Alexander N. Ngo, Babak Shahbaba

An Efficient Genetic Algorithm for Discovering Diverse-Frequent Patterns

Working with exhaustive search on large dataset is infeasible for several reasons. Recently, developed techniques that made pattern set mining feasible by a general solver with long execution time that supports heuristic search and are…

Artificial Intelligence · Computer Science 2015-07-21 Shanjida Khatun, Hasib Ul Alam, Swakkhar Shatabda

Biclustering Algorithms Based on Metaheuristics: A Review

Biclustering is an unsupervised machine learning technique that simultaneously clusters rows and columns in a data matrix. Biclustering has emerged as an important approach and plays an essential role in various applications such as…

Machine Learning · Computer Science 2022-03-31 Adan Jose-Garcia, Julie Jacques, Vincent Sobanski, Clarisse Dhaenens

Evolutionary Biclustering of Clickstream Data

Biclustering is a two way clustering approach involving simultaneous clustering along two dimensions of the data matrix. Finding biclusters of web objects (i.e. web users and web pages) is an emerging topic in the context of web usage…

Neural and Evolutionary Computing · Computer Science 2011-06-14 R. Rathipriya, Dr. K. Thangavel, J. Bagyamani

Bayesian Cluster Enumeration Criterion for Unsupervised Learning

We derive a new Bayesian Information Criterion (BIC) by formulating the problem of estimating the number of clusters in an observed data set as maximization of the posterior probability of the candidate models. Given that some mild…

Statistics Theory · Mathematics 2018-08-28 Freweyni K. Teklehaymanot, Michael Muma, Abdelhak M. Zoubir

Distributed Bayesian clustering using finite mixture of mixtures

In many modern applications, there is interest in analyzing enormous data sets that cannot be easily moved across computers or loaded into memory on a single computer. In such settings, it is very common to be interested in clustering.…

Computation · Statistics 2020-05-15 Hanyu Song, Yingjian Wang, David B. Dunson

A Novel Granular-Based Bi-Clustering Method of Deep Mining the Co-Expressed Genes

Traditional clustering methods are limited when dealing with huge and heterogeneous groups of gene expression data, which motivates the development of bi-clustering methods. Bi-clustering methods are used to mine bi-clusters whose subsets…

Computer Vision and Pattern Recognition · Computer Science 2020-05-13 Kaijie Xu, Witold Pedrycz, Zhiwu Li, Yinghui Quan, Weike Nie

A New Heuristic for Feature Selection by Consistent Biclustering

Given a set of data, biclustering aims at finding simultaneous partitions in biclusters of its samples and of the features which are used for representing the samples. Consistent biclusterings allow to obtain correct classifications of the…

Machine Learning · Computer Science 2010-03-18 Antonio Mucherino, Sonia Cafieri

Scaling Hierarchical Agglomerative Clustering to Billion-sized Datasets

Hierarchical Agglomerative Clustering (HAC) is one of the oldest but still most widely used clustering methods. However, HAC is notoriously hard to scale to large data sets as the underlying complexity is at least quadratic in the number of…

Machine Learning · Computer Science 2021-05-26 Baris Sumengen, Anand Rajagopalan, Gui Citovsky, David Simcha, Olivier Bachem, Pradipta Mitra, Sam Blasiak, Mason Liang, Sanjiv Kumar

BiSC: An algorithm for discovering generalized permutation patterns

Theorems relating permutations with objects in other fields of mathematics are often stated in terms of avoided patterns. Examples include various classes of Schubert varieties from algebraic geometry (Billey and Abe 2013), commuting…

Combinatorics · Mathematics 2024-11-28 Henning Ulfarsson

A Semidefinite Programming-Based Branch-and-Cut Algorithm for Biclustering

Biclustering, also called co-clustering, block clustering, or two-way clustering, involves the simultaneous clustering of both the rows and columns of a data matrix into distinct groups, such that the rows and columns within a group display…

Optimization and Control · Mathematics 2024-12-06 Antonio M. Sudoso

Exact and Heuristic Algorithms for Constrained Biclustering

Biclustering, also known as co-clustering or two-way clustering, simultaneously partitions the rows and columns of a data matrix to reveal submatrices with coherent patterns. Incorporating background knowledge into clustering to enhance…

Optimization and Control · Mathematics 2026-02-24 Antonio M. Sudoso

Estimation of Gaussian Bi-Clusters with General Block-Diagonal Covariance Matrix and Applications

Bi-clustering is a technique that allows for the simultaneous clustering of observations and features in a dataset. This technique is often used in bioinformatics, text mining, and time series analysis. An important advantage of…

Computation · Statistics 2023-02-09 Anastasiia Livochka, Ryan Browne, Sanjeena Subedi

Extended BIC for linear regression models with diverging number of relevant features and high or ultra-high feature spaces

In many conventional scientific investigations with high or ultra-high dimensional feature spaces, the relevant features, though sparse, are large in number compared with classical statistical problems, and the magnitude of their effects…

Statistics Theory · Mathematics 2011-07-14 Shan Luo, Zehua Chen

Boolean Reasoning-Based Biclustering for Shifting Pattern Extraction

Biclustering is a powerful approach to search for patterns in data, as it can be driven by a function that measures the quality of diverse types of patterns of interest. However, due to its computational complexity, the exploration of the…

Machine Learning · Computer Science 2021-04-27 Marcin Michalak, Jesús S. Aguilar-Ruiz

Differential gene co-expression networks via Bayesian biclustering models

Identifying latent structure in large data matrices is essential for exploring biological processes. Here, we consider recovering gene co-expression networks from gene expression data, where each network encodes relationships between genes…

Methodology · Statistics 2014-11-10 Chuan Gao, Shiwen Zhao, Ian C. McDowell, Christopher D. Brown, Barbara E. Engelhardt