Related papers: Network cross-validation by edge sampling
In the network literature, a wide range of statistical models has been proposed to exploit structural patterns in the data. Therefore, model selection between different models is a fundamental problem. However, there remains a lack of…
Complex and larger networks are becoming increasingly prevalent in scientific applications in various domains. Although a number of models and methods exist for such networks, cross-validation on networks remains challenging due to the…
Cross-validation is one of the most popular model selection methods in statistics and machine learning. Despite its wide applicability, traditional cross-validation methods tend to select overfitting models, due to the ignorance of the…
Used to estimate the risk of an estimator or to perform model selection, cross-validation is a widespread strategy because of its simplicity and its apparent universality. Many results exist on the model selection performances of…
Cross-validation is a popular non-parametric method for evaluating the accuracy of a predictive rule. The usefulness of cross-validation depends on the task we want to employ it for. In this note, I discuss a simple non-parametric setting,…
In a regression model, prediction is typically performed after model selection. The large variability in the model selection makes the prediction unstable. Thus, it is essential to reduce the variability in model selection and improve…
Cross-validation is frequently used for model selection in a variety of applications. However, it is difficult to apply cross-validation to mixed effects models (including nonlinear mixed effects models or NLME models) due to the fact that…
The super learner algorithm can be applied to combine the results of multiple base learners to improve the quality of predictions. The default method for verifying super learner results is nested cross-validation. It has been proposed by…
A popular data-driven method for choosing the bandwidth in standard kernel regression is cross-validation. Even when there are outliers in the data, robust kernel regression can be used to estimate the unknown regression curve [Robust and…
Cross-validation is the standard approach for tuning parameter selection in many non-parametric regression problems. However, its use is less common in change-point regression, perhaps as its prediction error-based criterion may appear to…
Correlation networks derived from multivariate data appear in many applications across the sciences. These networks are usually dense and require sparsification to detect meaningful structure. However, current methods for sparsifying…
Cross-validation is a widely-used technique to estimate prediction error, but its behavior is complex and not fully understood. Ideally, one would like to think that cross-validation estimates the prediction error for the model at hand, fit…
Network sampling is a crucial technique for analyzing large or partially observable networks. However, the effectiveness of different sampling methods can vary significantly depending on the context. In this study, we empirically compare…
The decision to incorporate cross-validation into validation processes of mathematical models raises an immediate question - how should one partition the data into calibration and validation sets? We answer this question systematically: we…
Cross-validation is a useful and generally applicable technique often employed in machine learning, including decision tree induction. An important disadvantage of straightforward implementation of the technique is its computational…
Cross-validation is commonly used for selecting tuning parameters in penalized regression, but its use in penalized Cox regression models has received relatively little attention in the literature. Due to its partial likelihood…
This paper considers the problem of model selection under domain shift. Motivated by principles from distributionally robust optimisation and domain adaptation theory, it is proposed that the training-validation split should maximise the…
When selecting a classification algorithm to be applied to a particular problem, one has to simultaneously select the best algorithm for that dataset and the best set of hyperparameters for the chosen model. The usual approach is to…
Reconstructing weighted networks from partial information is necessary in many important circumstances, e.g. for a correct estimation of systemic risk. It has been shown that, in order to achieve an accurate reconstruction, it is crucial to…
Cross-validation plays a fundamental role in machine learning, enabling robust evaluation of model performance and preventing overestimation on training and validation data. However, one of its drawbacks is the potential to create data…