Related papers: Data-Driven Tree Transforms and Metrics

200 papers

Clustered data, which arise when observations are nested within groups, are incredibly common in clinical, education, and social science research. Traditionally, a linear mixed model, which includes random effects to account for…

Methodology · Statistics 2026-02-04 Kevin McCoy, Zachary Wooten, Katarzyna Tomczak, Christine B. Peterson

In many modern applications, including analysis of gene expression and text documents, the data are noisy, high-dimensional, and unordered--with no particular meaning to the given order of the variables. Yet, successful learning is often…

Methodology · Statistics 2008-07-25 Ann B. Lee, Boaz Nadler, Larry Wasserman

'Big' high-dimensional data are commonly analyzed in low-dimensions, after performing a dimensionality-reduction step that inherently distorts the data structure. For the same purpose, clustering methods are also often used. These methods…

Machine Learning · Statistics 2019-02-20 Tom Lorimer, Karlis Kanders, Ruedi Stoop

Model-based clustering is widely used for identifying and distinguishing types of diseases. However, modern biomedical data coming with high dimensions make it challenging to perform the model estimation in traditional cluster analysis. The…

Methodology · Statistics 2025-07-22 Kazeem Kareem, Fan Dai

Rapid technological advances have allowed for molecular profiling across multiple omics domains from a single sample for clinical decision making in many diseases, especially cancer. As tumor development and progression are dynamic…

Methodology · Statistics 2022-02-11 Dongyan Yan, Subharup Guha

Advances in data collecting technologies in genomics have significantly increased the need for tools designed to study the genetic basis of many diseases. Effective statistical methods should excel in both prediction accuracy and biomarker…

Methodology · Statistics 2025-11-13 Anthony-Alexander Christidis, Stefan Van Aelst, Ruben Zamar

The task of clustering a set of objects based on multiple sources of data arises in several modern applications. We propose an integrative statistical model that permits a separate clustering of the objects for each data source. These…

Machine Learning · Statistics 2015-12-01 Eric F. Lock, David B. Dunson

We propose a new approach for clustering DNA features using array CGH data from multiple tumor samples. We distinguish data-collapsing: joining contiguous DNA clones or probes with extremely similar data into regions, from clustering:…

Applications · Statistics 2010-12-21 Kyung In Kim, Etienne Roquain, Mark Van De Wiel

The potential benefits of applying machine learning methods to -omics data are becoming increasingly apparent, especially in clinical settings. However, the unique characteristics of these data are not always well suited to machine learning…

Cancer has relational information residing at varying scales, modalities, and resolutions of the acquired data, such as radiology, pathology, genomics, proteomics, and clinical records. Integrating diverse data types can improve the…

Machine Learning · Computer Science 2024-07-29 Asim Waqas, Aakash Tripathi, Ravi P. Ramachandran, Paul Stewart, Ghulam Rasool

Understanding the global organization of complicated and high dimensional data is of primary interest for many branches of applied sciences. It is typically achieved by applying dimensionality reduction techniques mapping the considered…

Computational Geometry · Computer Science 2024-11-11 Paweł Dłotko, Davide Gurnari, Mathis Hallier, Anna Jurek-Loughrey

Connected acyclic graphs (trees) are data objects that hierarchically organize categories. Collections of trees arise in a diverse variety of fields, including evolutionary biology, public health, machine learning, social sciences and…

Methodology · Statistics 2025-12-01 Maria Alejandra Valdez Cabrera, Amy D Willis, Armeen Taeb

Over the past decades, statisticians and machine-learning researchers have developed literally thousands of new tools for the reduction of high-dimensional data in order to identify the variables most responsible for a particular trait.…

Machine Learning · Statistics 2012-05-31 Chamont Wang, Jana Gevertz, Chaur-Chin Chen, Leonardo Auslender

Multimodal image registration plays a key role in creating digital patient models by combining data from different imaging techniques into a single coordinate system. This process often involves multiple sequential and interconnected…

Computer Vision and Pattern Recognition · Computer Science 2025-05-23 Agnieszka Anna Tomaka, Dariusz Pojda, Michał Tarnawski, Leszek Luchowski

Personalized treatment of patients based on tissue-specific cancer subtypes has strongly increased the efficacy of the chosen therapies. Even though the amount of data measured for cancer patients has increased over the last years, most…

Machine Learning · Statistics 2017-09-18 Nora K. Speicher, Nico Pfeifer

Based on decision trees, many fields have arguably made tremendous progress in recent years. In simple words, decision trees use the strategy of "divide-and-conquer" to divide the complex problem on the dependency between input features and…

Machine Learning · Computer Science 2021-01-22 Jinxiong Zhang

In a world abundant with diverse data arising from complex acquisition techniques, there is a growing need for new data analysis methods. In this paper we focus on high-dimensional data that are organized into several hierarchical datasets.…

Machine Learning · Computer Science 2021-04-06 Lior Aloni, Omer Bobrowski, Ronen Talmon

We propose a model-based clustering algorithm for a general class of functional data for which the components could be curves or images. The random functional data realizations could be measured with error at discrete, and possibly random,…

Machine Learning · Statistics 2022-03-14 Steven Golovkine, Nicolas Klutchnikoff, Valentin Patilea

We present an approach to model-based hierarchical clustering by formulating an objective function based on a Bayesian analysis. This model organizes the data into a cluster hierarchy while specifying a complex feature-set partitioning that…

Machine Learning · Computer Science 2013-01-18 Shivakumar Vaithyanathan, Byron E Dom

The ongoing explosion of genome sequence data is transforming how we reconstruct and understand the histories of biological systems. Across biological scales, from individual cells to populations and species, tree-based models provide a…

Populations and Evolution · Quantitative Biology 2025-12-08 Yun Deng, Shing H. Zhan, Yulin Zhang, Chao Zhang, Bingjie Chen