Related papers: The Weight Function in the Subtree Kernel is Decis…
The kernel method is a potential approach to analyzing structured data such as sequences, trees, and graphs; however, unordered trees have not been investigated extensively. Kimura et al. (2011) proposed a kernel function for unordered…
While balancing covariates between groups is central for observational causal inference, selecting which features to balance remains a challenging problem. Kernel balancing is a promising approach that first estimates a kernel that captures…
We propose a tree-based algorithm for classification and regression problems in the context of functional data analysis, which allows to leverage representation learning and multiple splitting rules at the node level, reducing…
Graph is an usual representation of relational data, which are ubiquitous in manydomains such as molecules, biological and social networks. A popular approach to learningwith graph structured data is to make use of graph kernels, which…
Tree kernels have been proposed to be used in many areas as the automatic learning of natural language applications. In this paper, we propose a new linear time algorithm based on the concept of weighted tree automata for SubTree kernel…
Tree kernels are fundamental tools that have been leveraged in many applications, particularly those based on machine learning for Natural Language Processing tasks. In this paper, we devise a parallel implementation of the sequential…
In supervised learning, the output variable to be predicted is often represented as a function, such as a spectrum or probability distribution. Despite its importance, functional output regression remains relatively unexplored. In this…
Random Forest (Breiman, 2001) is a successful and widely used regression and classification algorithm. Part of its appeal and reason for its versatility is its (implicit) construction of a kernel-type weighting function on training data,…
Tree kernels have demonstrated their ability to deal with hierarchical data, as the intrinsic tree structure often plays a discriminative role. While such kernels have been successfully applied to various domains such as nature language…
Clustered data, which arise when observations are nested within groups, are incredibly common in clinical, education, and social science research. Traditionally, a linear mixed model, which includes random effects to account for…
The increasing complexity of data requires methods and models that can effectively handle intricate structures, as simplifying them would result in loss of information. While several analytical tools have been developed to work with complex…
Feature importance aims at measuring how crucial each input feature is for model prediction. It is widely used in feature engineering, model selection and explainable artificial intelligence (XAI). In this paper, we propose a new tree-model…
Improvement of statistical learning models in order to increase efficiency in solving classification or regression problems is still a goal pursued by the scientific community. In this way, the support vector machine model is one of the…
Nowadays, feature selection is frequently used in machine learning when there is a risk of performance degradation due to overfitting or when computational resources are limited. During the feature selection process, the subset of features…
A soft tree is an actively studied variant of a decision tree that updates splitting rules using the gradient method. Although soft trees can take various architectures, their impact is not theoretically well known. In this paper, we…
Decision trees are popular machine learning models that are simple to build and easy to interpret. Even though algorithms to learn decision trees date back to almost 50 years, key properties affecting their generalization error are still…
A random forest prediction can be computed by the scalar product of the labels of the training examples and a set of weights that are determined by the leafs of the forest into which the test object falls; each prediction can hence be…
In deep neural networks, better results can often be obtained by increasing the complexity of previously developed basic models. However, it is unclear whether there is a way to boost performance by decreasing the complexity of such models.…
In deep neural networks, better results can often be obtained by increasing the complexity of previously developed basic models. However, it is unclear whether there is a way to boost performance by decreasing the complexity of such models.…
Measuring the complexity of tree structures can be beneficial in areas that use tree data structures for storage, communication, and processing purposes. This complexity can then be used to compress tree data structures to their…