Related papers: EBIC: an evolutionary-based parallel biclustering …
Motivation: In this paper we present the latest release of EBIC, a next-generation biclustering algorithm for mining genetic data. The major contribution of this paper is adding support for big data, making it possible to efficiently run…
Biclustering is a data mining technique which searches for local patterns in numeric tabular data with main application in bioinformatics. This technique has shown promise in multiple areas, including development of biomarkers for cancer,…
Biclustering is an unsupervised machine-learning approach aiming to cluster rows and columns simultaneously in a data matrix. Several biclustering algorithms have been proposed for handling numeric datasets. However, real-world data mining…
Biclustering is an unsupervised data mining technique that aims to unveil patterns (biclusters) from gene expression data matrices. In the framework of this thesis, we propose new biclustering algorithms for microarray data. The latter is…
Biclustering is a class of techniques that simultaneously clusters the rows and columns of a matrix to sort heterogeneous data into homogeneous blocks. Although many algorithms have been proposed to find biclusters, existing methods suffer…
Working with exhaustive search on large dataset is infeasible for several reasons. Recently, developed techniques that made pattern set mining feasible by a general solver with long execution time that supports heuristic search and are…
Biclustering is an unsupervised machine learning technique that simultaneously clusters rows and columns in a data matrix. Biclustering has emerged as an important approach and plays an essential role in various applications such as…
Biclustering is a two way clustering approach involving simultaneous clustering along two dimensions of the data matrix. Finding biclusters of web objects (i.e. web users and web pages) is an emerging topic in the context of web usage…
We derive a new Bayesian Information Criterion (BIC) by formulating the problem of estimating the number of clusters in an observed data set as maximization of the posterior probability of the candidate models. Given that some mild…
In many modern applications, there is interest in analyzing enormous data sets that cannot be easily moved across computers or loaded into memory on a single computer. In such settings, it is very common to be interested in clustering.…
Traditional clustering methods are limited when dealing with huge and heterogeneous groups of gene expression data, which motivates the development of bi-clustering methods. Bi-clustering methods are used to mine bi-clusters whose subsets…
Given a set of data, biclustering aims at finding simultaneous partitions in biclusters of its samples and of the features which are used for representing the samples. Consistent biclusterings allow to obtain correct classifications of the…
Hierarchical Agglomerative Clustering (HAC) is one of the oldest but still most widely used clustering methods. However, HAC is notoriously hard to scale to large data sets as the underlying complexity is at least quadratic in the number of…
Theorems relating permutations with objects in other fields of mathematics are often stated in terms of avoided patterns. Examples include various classes of Schubert varieties from algebraic geometry (Billey and Abe 2013), commuting…
Biclustering, also called co-clustering, block clustering, or two-way clustering, involves the simultaneous clustering of both the rows and columns of a data matrix into distinct groups, such that the rows and columns within a group display…
Biclustering, also known as co-clustering or two-way clustering, simultaneously partitions the rows and columns of a data matrix to reveal submatrices with coherent patterns. Incorporating background knowledge into clustering to enhance…
Bi-clustering is a technique that allows for the simultaneous clustering of observations and features in a dataset. This technique is often used in bioinformatics, text mining, and time series analysis. An important advantage of…
In many conventional scientific investigations with high or ultra-high dimensional feature spaces, the relevant features, though sparse, are large in number compared with classical statistical problems, and the magnitude of their effects…
Biclustering is a powerful approach to search for patterns in data, as it can be driven by a function that measures the quality of diverse types of patterns of interest. However, due to its computational complexity, the exploration of the…
Identifying latent structure in large data matrices is essential for exploring biological processes. Here, we consider recovering gene co-expression networks from gene expression data, where each network encodes relationships between genes…