Related papers: Multi-TGDR: a regularization method for multi-clas…
This paper investigates methods for improving generative data augmentation for deep learning. Generative data augmentation leverages the synthetic samples produced by generative models as an additional dataset for classification with small…
We propose a modification of linear discriminant analysis, referred to as compressive regularized discriminant analysis (CRDA), for analysis of high-dimensional datasets. CRDA is specially designed for feature elimination purpose and can be…
Normalization is a critical step in quantitative analyses of biological processes. Recent works show that cross-platform integration and normalization enable machine learning (ML) training on RNA microarray and RNA-seq data, but no…
Gene expression analysis aims at identifying the genes able to accurately predict biological parameters like, for example, disease subtyping or progression. While accurate prediction can be achieved by means of many different techniques,…
As the development of neural networks, more and more deep neural networks are adopted in various tasks, such as image classification. However, as the huge computational overhead, these networks could not be applied on mobile devices or…
We propose MCGrad, a novel and scalable multicalibration algorithm. Multicalibration - calibration in subgroups of the data - is an important property for the performance of machine learning-based systems. Existing multicalibration methods…
CDR (Cross-Domain Recommendation), i.e., leveraging information from multiple domains, is a critical solution to data sparsity problem in recommendation system. The majority of previous research either focused on single-target CDR (STCDR)…
We propose a novel distance-based regularization method for deep metric learning called Multi-level Distance Regularization (MDR). MDR explicitly disturbs a learning procedure by regularizing pairwise distances between embedding vectors…
In scientific machine learning, regression networks have been recently applied to approximate solution maps (e.g., potential-ground state map of Schr\"odinger equation). In this paper, we aim to reduce the generalization error without…
In genetical genomics studies, it is important to jointly analyze gene expression data and genetic variants in exploring their associations with complex traits, where the dimensionality of gene expressions and genetic variants can both be…
With the growing attention on learning-to-learn new tasks using only a few examples, meta-learning has been widely used in numerous problems such as few-shot classification, reinforcement learning, and domain generalization. However,…
Sufficient dimension reduction (SDR) methods, which often rely on class precision matrices, are widely used in supervised statistical classification problems. However, when class-specific sample sizes are small relative to the original…
Machine Learning (ML) algorithms have been used for assessing soil quality parameters along with non-destructive methodologies. Among spectroscopic analytical methodologies, energy dispersive X-ray fluorescence (EDXRF) is one of the more…
Multivariate time-series data in numerous real-world applications (e.g., healthcare and industry) are informative but challenging due to the lack of labels and high dimensionality. Recent studies in self-supervised learning have shown their…
To generalize the model trained in source domains to unseen target domains, domain generalization (DG) has recently attracted lots of attention. Since target domains can not be involved in training, overfitting source domains is inevitable.…
Probabilistic models must be well calibrated to support reliable decision-making. While calibration in single-output regression is well studied, defining and achieving multivariate calibration in multi-output regression remains considerably…
Machine learning models trained with \emph{stochastic} gradient descent (SGD) can generalize better than those trained with deterministic gradient descent (GD). In this work, we study SGD's impact on generalization through the lens of the…
How to improve discriminative feature learning is central in classification. Existing works address this problem by explicitly increasing inter-class separability and intra-class similarity, whether by constructing positive and negative…
We propose \textit{Meta-Regularization}, a novel approach for the adaptive choice of the learning rate in first-order gradient descent methods. Our approach modifies the objective function by adding a regularization term on the learning…
Dataset distillation compresses large datasets into compact synthetic ones to reduce storage and computational costs. Among various approaches, distribution matching (DM)-based methods have attracted attention for their high efficiency.…