Related papers: Directionally Dependent Multi-View Clustering Usin…

200 papers

Clustering multivariate data is a pervasive task in many applied problems, particularly in social studies and life science. Model-based approaches to clustering rely on mixture models, where each mixture component corresponds to the kernel…

Methodology · Statistics 2026-01-22 Laura Ferrini, Federico Castelletti

We introduce a copula mixture model to perform dependency-seeking clustering when co-occurring samples from different data sources are available. The model takes advantage of the great flexibility offered by the copulas framework to extend…

Methodology · Statistics 2012-07-03 Melanie Rey, Volker Roth

Various data modalities are common in real-world applications (e.g., electronic health records, medical images and clinical notes in healthcare). It is essential to develop multimodal learning methods to aggregate various information from…

Machine Learning · Computer Science 2025-11-06 Feng Wu, Tsai Hor Chan, Fuying Wang, Guosheng Yin, Lequan Yu

Due to the complexity of cancer, clustering algorithms have been used to disentangle the observed heterogeneity and identify cancer subtypes that can be treated specifically. While kernel based clustering approaches allow the use of more…

Machine Learning · Statistics 2018-11-21 Nora K. Speicher, Nico Pfeifer

The majority of model-based clustering techniques is based on multivariate Normal models and their variants. In this paper copulas are used for the construction of flexible families of models for clustering applications. The use of copulas…

Methodology · Statistics 2018-02-16 Ioannis Kosmidis, Dimitris Karlis

We propose a new approach for clustering DNA features using array CGH data from multiple tumor samples. We distinguish data-collapsing: joining contiguous DNA clones or probes with extremely similar data into regions, from clustering:…

Applications · Statistics 2010-12-21 Kyung In Kim, Etienne Roquain, Mark Van De Wiel

Modelling and understanding directional gene networks is a major challenge in biology as they play an important role in the architecture and function of genetic systems. Copula Directional Dependence (CDD) can measure the directed…

Methodology · Statistics 2022-03-11 Vasiliki Vamvaka, Clara Grazian

Clustering is commonly performed as an initial analysis step for uncovering structure in 'omics datasets, e.g. to discover molecular subtypes of disease. The high-throughput, high-dimensional nature of these datasets means that they provide…

Methodology · Statistics 2023-03-02 Paul D. W. Kirk, Filippo Pagani, Sylvia Richardson

We propose a novel method for multiple clustering that assumes a co-clustering structure (partitions in both rows and columns of the data matrix) in each view. The new method is applicable to high-dimensional data. It is based on a…

Cancer genomes exhibit a large number of different alterations that affect many genes in a diverse manner. It is widely believed that these alterations follow combinatorial patterns that have a strong connection with the underlying…

Machine Learning · Computer Science 2016-01-26 Jack P. Hou, Amin Emad, Gregory J. Puleo, Jian Ma, Olgica Milenkovic

Clinical and genomic models are both used to predict breast cancer outcomes, but they are often combined using simple linear rules that do not account for how their risk scores relate, especially at the extremes. Using the METABRIC breast…

Machine Learning · Computer Science 2025-11-25 Agnideep Aich, Sameera Hewage, Md Monzur Murshed

Many common clustering methods cannot be used for clustering multivariate longitudinal data in cases where variables exhibit high autocorrelations. In this article, a copula kernel mixture model (CKMM) is proposed for clustering data of…

Methodology · Statistics 2025-06-23 Xi Zhang, Orla A. Murphy, Paul D. McNicholas

Deep multi-view clustering seeks to utilize the abundant information from multiple views to improve clustering performance. However, most of the existing clustering methods often neglect to fully mine multi-view structural information and…

Computer Vision and Pattern Recognition · Computer Science 2025-03-17 Jinrong Cui, Xiaohuang Wu, Haitao Zhang, Chongjie Dong, Jie Wen

The majority of finite mixture models suffer from not allowing asymmetric tail dependencies within components and not capturing non-elliptical clusters in clustering applications. Since vine copulas are very flexible in capturing these…

Methodology · Statistics 2021-09-09 Özge Sahin, Claudia Czado

Working with annotated data is the cornerstone of supervised learning. Nevertheless, providing labels to instances is a task that requires significant human effort. Several critical real-world applications make things more complicated…

Computer Vision and Pattern Recognition · Computer Science 2025-09-10 Erencem Ozbey, Dimitrios I. Diochnos

Copula mixed models for trivariate (or bivariate) meta-analysis of diagnostic test accuracy studies accounting (or not) for disease prevalence have been proposed in the biostatistics literature to synthesize information. However, many…

Methodology · Statistics 2018-07-12 Aristidis K. Nikoloulopoulos

We propose a new methodology for selecting and ranking covariates associated with a variable of interest in a context of high-dimensional data under dependence but few observations. The methodology successively intertwines the clustering of…

Handling highly dependent data is crucial in clinical trials, particularly in fields related to ophthalmology. Incorrectly specifying the dependency structure can lead to biased inferences. Traditionally, models rely on three fixed…

Methodology · Statistics 2025-09-30 Shuyi Liang, Takeshi Emura, Chang-Xing Ma, Yijing Xin, Xin-Wei Huang

The task of clustering a set of objects based on multiple sources of data arises in several modern applications. We propose an integrative statistical model that permits a separate clustering of the objects for each data source. These…

Machine Learning · Statistics 2015-12-01 Eric F. Lock, David B. Dunson

In the Pioneer 100 (P100) Wellness Project (Price and others, 2017), multiple types of data are collected on a single set of healthy participants at multiple timepoints in order to characterize and optimize wellness. One way to do this is…

Methodology · Statistics 2019-01-15 Lucy L. Gao, Jacob Bien, Daniela Witten