Related papers: Better Aggregation in Test-Time Augmentation
Test-time augmentation -- the aggregation of predictions across transformed examples of test inputs -- is an established technique to improve the performance of image classification models. Importantly, TTA can be used to improve model…
Data augmentation has been actively studied for robust neural networks. Most of the recent data augmentation methods focus on augmenting datasets during the training phase. At the testing phase, simple transformations are still widely used…
Data augmentation is known to contribute significantly to the robustness of machine learning models. In most instances, data augmentation is utilized during the training phase. Test-Time Augmentation (TTA) is a technique that instead…
Test-time augmentation (TTA)--aggregating predictions over multiple augmented copies of a test input--is widely assumed to improve classification accuracy, particularly in medical imaging where it is routinely deployed in production systems…
The popularity of data augmentation techniques in machine learning has increased in recent years, as they enable the creation of new samples from existing datasets. Rotational augmentation, in particular, has shown great promise by…
A conformal classifier produces a set of predicted classes and provides a probabilistic guarantee that the set includes the true class. Unfortunately, it is often the case that conformal classifiers produce uninformatively large sets. In…
Test-Time Augmentation (TTA) is a very powerful heuristic that takes advantage of data augmentation during testing to produce averaged output. Despite the experimental effectiveness of TTA, there is insufficient discussion of its…
Test-time augmentation (TTA) is a well-known technique employed during the testing phase of computer vision tasks. It involves aggregating multiple augmented versions of input data. Combining predictions using a simple average formulation…
In this study, we introduce an intelligent Test Time Augmentation (TTA) algorithm designed to enhance the robustness and accuracy of image classification models against viewpoint variations. Unlike traditional TTA methods that…
In this paper, we propose Test-Time Training, a general approach for improving the performance of predictive models when training and test data come from different distributions. We turn a single unlabeled test sample into a self-supervised…
We consider the problem of improving the human instance segmentation mask quality for a given test image using keypoints estimation. We compare two alternative approaches. The first approach is a test-time adaptation (TTA) method, where we…
Averaging predictions over a set of models -- an ensemble -- is widely used to improve predictive performance and uncertainty estimation of deep learning models. At the same time, many machine learning systems, such as search, matching, and…
Factual probing is a method that uses prompts to test if a language model "knows" certain world knowledge facts. A problem in factual probing is that small changes to the prompt can lead to large changes in model output. Previous work aimed…
Modern deep neural networks can produce badly calibrated predictions, especially when train and test distributions are mismatched. Training an ensemble of models and averaging their predictions can help alleviate these issues. We propose a…
Despite the state-of-the-art performance for medical image segmentation, deep convolutional neural networks (CNNs) have rarely provided uncertainty estimations regarding their segmentation outputs, e.g., model (epistemic) and image-based…
Data augmentation is a widely used technique in many machine learning tasks, such as image classification, to virtually enlarge the training dataset size and avoid overfitting. Traditional data augmentation techniques for image…
Data augmentation methods have been shown to be a fundamental technique to improve generalization in tasks such as image, text and audio classification. Recently, automated augmentation methods have led to further improvements on image…
Ensembles of artificial neural networks show improved generalization capabilities that outperform those of single networks. However, for aggregation to be effective, the individual networks must be as accurate and diverse as possible. An…
Ensemble methods which average over multiple neural network predictions are a simple approach to improve a model's calibration and robustness. Similarly, data augmentation techniques, which encode prior information in the form of invariant…
Gathering observational data for medical decision-making often involves uncertainties arising from both type I (false positive)and type II (false negative) errors. In this work, we develop a statistical model to study how medical…