Related papers: Exploring data subsets with vtree

Treelets--An adaptive multi-scale basis for sparse unordered data

In many modern applications, including analysis of gene expression and text documents, the data are noisy, high-dimensional, and unordered--with no particular meaning to the given order of the variables. Yet, successful learning is often…

Methodology · Statistics 2008-07-25 Ann B. Lee, Boaz Nadler, Larry Wasserman

Energy Trees: Regression and Classification With Structured and Mixed-Type Covariates

The increasing complexity of data requires methods and models that can effectively handle intricate structures, as simplifying them would result in loss of information. While several analytical tools have been developed to work with complex…

Methodology · Statistics 2023-06-16 Riccardo Giubilei, Tullia Padellini, Pierpaolo Brutti

Juniper: A Tree+Table Approach to Multivariate Graph Visualization

Analyzing large, multivariate graphs is an important problem in many domains, yet such graphs are challenging to visualize. In this paper, we introduce a novel, scalable, tree+table multivariate graph visualization technique, which makes…

Human-Computer Interaction · Computer Science 2018-08-03 Carolina Nobre, Marc Streit, Alexander Lex

Multi-Scale Vector Quantization with Reconstruction Trees

We propose and study a multi-scale approach to vector quantization. We develop an algorithm, dubbed reconstruction trees, inspired by decision trees. Here the objective is parsimonious reconstruction of unsupervised data, rather than…

Machine Learning · Computer Science 2019-09-05 Enrico Cecini, Ernesto De Vito, Lorenzo Rosasco

New Algorithms on Wavelet Trees and Applications to Information Retrieval

Wavelet trees are widely used in the representation of sequences, permutations, text collections, binary relations, discrete points, and other succinct data structures. We show, however, that this still falls short of exploiting all of the…

Data Structures and Algorithms · Computer Science 2010-11-23 Travis Gagie, Gonzalo Navarro, Simon J. Puglisi

Quantification and visualization of variation in anatomical trees

This paper presents two approaches to quantifying and visualizing variation in datasets of trees. The first approach localizes subtrees in which significant population differences are found through hypothesis testing and sparse classifiers…

Applications · Statistics 2014-10-10 Nina Amenta, Manasi Datar, Asger Dirksen, Marleen de Bruijne, Aasa Feragen, Xiaoyin Ge, Jesper Holst Pedersen, Marylesa Howard, Megan Owen, Jens Petersen, Jie Shi, Qiuping Xu

Uncertain Trees: Dealing with Uncertain Inputs in Regression Trees

Tree-based ensemble methods, as Random Forests and Gradient Boosted Trees, have been successfully used for regression in many applications and research studies. Furthermore, these methods have been extended in order to deal with uncertainty…

Machine Learning · Computer Science 2018-11-20 Myriam Tami, Marianne Clausel, Emilie Devijver, Adrien Dulac, Eric Gaussier, Stefan Janaqi, Meriam Chebre

Uncharted Forest a Technique for Exploratory Data Analysis

Exploratory data analysis is crucial for developing and understanding classification models from high-dimensional datasets. We explore the utility of a new unsupervised tree ensemble called uncharted forest for visualizing class…

Machine Learning · Statistics 2018-07-03 Casey Kneale, Steven D. Brown

Variable selection and sensitivity analysis using dynamic trees, with an application to computer code performance tuning

We investigate an application in the automatic tuning of computer codes, an area of research that has come to prominence alongside the recent rise of distributed scientific processing and heterogeneity in high-performance computing…

Applications · Statistics 2013-04-17 Robert B. Gramacy, Matt Taddy, Stefan M. Wild

A Survey on Latent Tree Models and Applications

In data analysis, latent variables play a central role because they help provide powerful insights into a wide variety of phenomena, ranging from biological to human sciences. The latent tree model, a particular type of probabilistic…

Machine Learning · Computer Science 2014-02-05 Raphaël Mourad, Christine Sinoquet, Nevin L. Zhang, Tengfei Liu, Philippe Leray

Tree-Structured Modelling of Varying Coefficients

The varying-coefficient model is a strong tool for the modelling of interactions in generalized regression. It is easy to apply if both the variables that are modified as well as the effect modifiers are known. However, in general one has a…

Methodology · Statistics 2017-05-25 Moritz Berger, Gerhard Tutz, Matthias Schmid

FREEtree: A Tree-based Approach for High Dimensional Longitudinal Data With Correlated Features

This paper proposes FREEtree, a tree-based method for high dimensional longitudinal data with correlated features. Popular machine learning approaches, like Random Forests, commonly used for variable selection do not perform well when there…

Machine Learning · Statistics 2020-06-18 Yuancheng Xu, Athanasse Zafirov, R. Michael Alvarez, Dan Kojis, Min Tan, Christina M. Ramirez

A Novel Tree Visualization to Guide Interactive Exploration of Multi-dimensional Topological Hierarchies

Understanding the response of an output variable to multi-dimensional inputs lies at the heart of many data exploration endeavours. Topology-based methods, in particular Morse theory and persistent homology, provide a useful framework for…

Graphics · Computer Science 2022-08-16 Yarden Livnat, Dan Maljovec, Attila Gyulassy, Dr Baptiste Mouginot, Valerio Pascucci

Latent Tree Analysis

Latent tree analysis seeks to model the correlations among a set of random variables using a tree of latent variables. It was proposed as an improvement to latent class analysis --- a method widely used in social sciences and medicine to…

Machine Learning · Computer Science 2016-10-04 Nevin L. Zhang, Leonard K. M. Poon

RSATree: Distribution-Aware Data Representation of Large-Scale Tabular Datasets for Flexible Visual Query

Analysts commonly investigate the data distributions derived from statistical aggregations of data that are represented by charts, such as histograms and binned scatterplots, to visualize and analyze a large-scale dataset. Aggregate queries…

Databases · Computer Science 2019-10-14 Honghui Mei, Wei Chen, Yating Wei, Yuanzhe Hu, Shuyue Zhou, Bingru Lin, Ying Zhao, Jiazhi Xia

Seeing the Forest through the Trees: Adaptive Local Exploration of Large Graphs

Visualization is a powerful paradigm for exploratory data analysis. Visualizing large graphs, however, often results in a meaningless hairball. In this paper, we propose a different approach that helps the user adaptively explore large…

Information Retrieval · Computer Science 2016-07-25 Robert Pienta, Zhiyuan Lin, Minsuk Kahng, Jilles Vreeken, Partha P. Talukdar, James Abello, Ganesh Parameswaran, Duen Horng Chau

visTree: Visualization of Subgroups for a Decision Tree

Decision trees are flexible prediction models which are constructed to quantify outcome-covariate relationships and characterize relevant population subgroups. However, the standard graphical representation of fitted decision trees…

Applications · Statistics 2021-03-09 Ashwini Venkatasubramaniam, Julian Wolfson

Interpreting Tree Ensembles with inTrees

Tree ensembles such as random forests and boosted trees are accurate but difficult to understand, debug and deploy. In this work, we provide the inTrees (interpretable trees) framework that extracts, measures, prunes and selects rules from…

Machine Learning · Computer Science 2014-08-26 Houtao Deng

Regression Trees Know Calculus

Regression trees have emerged as a preeminent tool for solving real-world regression problems due to their ability to deal with nonlinearities, interaction effects and sharp discontinuities. In this article, we rather study regression trees…

Machine Learning · Statistics 2025-11-14 Nathan Wycoff

Discussion of: Treelets--An adaptive multi-scale basis for sparse unordered data

We would like to congratulate Lee, Nadler and Wasserman on their contribution to clustering and data reduction methods for high $p$ and low $n$ situations. A composite of clustering and traditional principal components analysis, treelets is…

Applications · Statistics 2008-07-28 Catherine Tuglus, Mark J. van der Laan