Related papers: Treelets--An adaptive multi-scale basis for sparse…
This is a discussion of paper "Treelets--An adaptive multi-scale basis for sparse unordered data" [arXiv:0707.0481] by Ann B. Lee, Boaz Nadler and Larry Wasserman. In this paper the authors defined a new type of dimension reduction…
We congratulate Lee, Nadler and Wasserman (henceforth LNW) on a very interesting paper on new methodology and supporting theory [arXiv:0707.0481]. Treelets seem to tackle two important problems of modern data analysis at once. For datasets…
We would like to congratulate Lee, Nadler and Wasserman on their contribution to clustering and data reduction methods for high $p$ and low $n$ situations. A composite of clustering and traditional principal components analysis, treelets is…
Discussion of "Treelets--An adaptive multi-Scale basis for sparse unordered data" [arXiv:0707.0481]
Discussion of "Treelets--An adaptive multi-scale basis for sparse unordered data" [arXiv:0707.0481]
Discussion of "Treelets--An adaptive multi-scale basis for sparse unordered data" [arXiv:0707.0481]
We consider the problem of sparse variable selection on high dimension heterogeneous data sets, which has been taking on renewed interest recently due to the growth of biological and medical data sets with complex, non-i.i.d. structures and…
We propose a new outline for adaptive dictionary learning methods for sparse encoding based on a hierarchical clustering of the training data. Through recursive application of a clustering method, the data is organized into a binary…
Sparse representations has shown to be a very powerful model for real world signals, and has enabled the development of applications with notable performance. Combined with the ability to learn a dictionary from signal examples,…
Ensembles of decision trees are a useful tool for obtaining for obtaining flexible estimates of regression functions. Examples of these methods include gradient boosted decision trees, random forests, and Bayesian CART. Two potential…
Variable trees are a new method for the exploration of discrete multivariate data. They display nested subsets and corresponding frequencies and percentages. Manual calculation of these quantities can be laborious, especially when there are…
This paper proposes FREEtree, a tree-based method for high dimensional longitudinal data with correlated features. Popular machine learning approaches, like Random Forests, commonly used for variable selection do not perform well when there…
Wavelet trees are widely used in the representation of sequences, permutations, text collections, binary relations, discrete points, and other succinct data structures. We show, however, that this still falls short of exploiting all of the…
In data analysis, latent variables play a central role because they help provide powerful insights into a wide variety of phenomena, ranging from biological to human sciences. The latent tree model, a particular type of probabilistic…
Dynamic regression trees are an attractive option for automatic regression and classification with complicated response surfaces in on-line application settings. We create a sequential tree model whose state changes in time with the…
We consider the analysis of high dimensional data given in the form of a matrix with columns consisting of observations and rows consisting of features. Often the data is such that the observations do not reside on a regular grid, and the…
Machine learning problems involving sparse datasets may benefit from the use of convolutional neural networks if the numbers of samples and features are very large. Such datasets are increasingly more frequently encountered in a variety of…
Tree-structured models are a powerful alternative to parametric regression models if non-linear effects and interactions are present in the data. Yet, classical tree-structured models might not be appropriate if data comes in clusters of…
Dealing with datasets of very high dimension is a major challenge in machine learning. In this paper, we consider the problem of feature selection in applications where the memory is not large enough to contain all features. In this…
Several structural learning algorithms for staged tree models, an asymmetric extension of Bayesian networks, have been defined. However, they do not scale efficiently as the number of variables considered increases. Here we introduce the…