Related papers: Discriminating sample groups with multi-way data
Modern data often take the form of a multiway array. However, most classification methods are designed for vectors, i.e., 1-way arrays. Distance weighted discrimination (DWD) is a popular high-dimensional classification method that has been…
Distance weighted discrimination (DWD) is a linear discrimination method that is particularly well-suited for classification tasks with high-dimensional data. The DWD coefficients minimize an intuitive objective function, which can be solved…
A novel linear classification method that possesses the merits of both the Support Vector Machine (SVM) and the Distance-weighted Discrimination (DWD) is proposed in this article. The proposed Distance-weighted Support Vector Machine method…
Classification is an important topic in statistics and machine learning with great potential in many real applications. In this paper, we investigate two popular large margin classification methods, Support Vector Machine (SVM) and Distance…
Motivated by the challenge of using DNA-seq data to identify viruses in human blood samples, we propose a novel classification algorithm called "Radial Distance Weighted Discrimination" (or Radial DWD). This classifier is designed for…
High dimension low sample size statistical analysis is important in a wide range of applications. In such situations, the highly appealing discrimination method, support vector machine, can be improved to alleviate data piling at the…
Objective: Brain networks have gained increasing recognition as potential biomarkers in mental health studies, but there are limited approaches that can leverage complex brain networks for accurate classification. Our goal is to develop a…
High dimension low sample size (HDLSS) data are prevalent in many fields of study. There has been an increased focus recently on using machine learning and statistical methods to mine valuable information out of…
In many applications, input data are sampled functions taking their values in infinite dimensional spaces rather than standard vectors. This fact has complex consequences on data analysis algorithms that motivate modifications of them. In…
High dimensional classification has been highlighted for the last two decades and much research has been conducted in order to circumvent challenges encountered in high dimensions. While existing methods have focused mainly on developing…
Data series classification is an important and challenging problem in data science. Explaining the classification decisions by finding the discriminant parts of the input that led the algorithm to some decisions is a real need in many…
Analysis of high-dimensional data is currently a popular field of research, thanks to many applications, e.g., in genetics (DNA data in genome-wide association studies), spectrometry, or web analysis. At the same time, the type of problems that…
This work is motivated by the needs of predictive analytics on healthcare data as represented by Electronic Medical Records. Such data is invariably problematic: noisy, with missing entries, with imbalance in classes of interest, leading…
Datasets containing both categorical and continuous variables are frequently encountered in many areas, and with the rapid development of modern measurement technologies, the dimensions of these variables can be very high. Despite the…
We introduce a new method of performing high dimensional discriminant analysis, which we call multiDA. We achieve this by constructing a hybrid model that seamlessly integrates a multiclass diagonal discriminant analysis model and feature…
Low-dimensional embeddings for data from disparate sources play critical roles in multi-modal machine learning, multimedia information retrieval, and bioinformatics. In this paper, we propose a supervised dimensionality reduction method…
With the advancement of communication and security technologies, robust embedded biometric systems have become crucial. This paper presents the realization of such technologies, which demands reliable and error-free…
Medical studies frequently require extracting the relationship between each covariate and the outcome with statistical confidence measures. To do this, simple parametric models are frequently used (e.g. coefficients of linear regression)…
A huge number of applications in various fields, such as gene expression analysis or computer vision, involve high-dimensional low-sample-size (HDLSS) data sets, which pose great challenges for standard statistical and…
Researchers in the behavioral and social sciences use linear discriminant analysis (LDA) for predictions of group membership (classification) and for identifying the variables most relevant to group separation among a set of continuous…