Related papers: How Many Genes Are Needed for a Discriminant Micro…

Algorithm for Finding Optimal Gene Sets in Microarray Prediction

Motivation: Microarray data has recently been shown to be efficacious in distinguishing closely related cell types that often appear in the diagnosis of cancer. It is useful to determine the minimum number of genes needed to do such a…

Biological Physics · Physics 2007-05-23 J. M. Deutsch

Extreme Value Distribution Based Gene Selection Criteria for Discriminant Microarray Data Analysis Using Logistic Regression

One important issue commonly encountered in the analysis of microarray data is to decide which and how many genes should be selected for further studies. For discriminant microarray data analyses based on statistical models, such as the…

Quantitative Methods · Quantitative Biology 2009-11-09 Wentian Li, Fengzhu Sun, Ivo Grosse

Gene selection for cancer classification using a hybrid of univariate and multivariate feature selection methods

Various approaches to gene selection for cancer classification based on microarray data can be found in the literature and they may be grouped into two categories: univariate methods and multivariate methods. Univariate methods look at each…

Quantitative Methods · Quantitative Biology 2015-06-18 Min Xu, Rudy Setiono

A Comparative Analysis of Gene Expression Profiling by Statistical and Machine Learning Approaches

Many machine learning models have been proposed to classify phenotypes from gene expression data. In addition to their good performance, these models can potentially provide some understanding of phenotypes by extracting explanations for…

Genomics · Quantitative Biology 2024-02-05 Myriam Bontonou, Anaïs Haget, Maria Boulougouri, Benjamin Audit, Pierre Borgnat, Jean-Michel Arbona

A Regularized Method for Selecting Nested Groups of Relevant Genes from Microarray Data

Gene expression analysis aims at identifying the genes able to accurately predict biological parameters like, for example, disease subtyping or progression. While accurate prediction can be achieved by means of many different techniques,…

Methodology · Statistics 2008-09-11 Christine De Mol, Sofia Mosci, Magali Traskine, Alessandro Verri

Reliably determining which genes have a high posterior probability of differential expression: A microarray application of decision-theoretic multiple testing

Microarray data are often used to determine which genes are differentially expressed between groups, for example, between treatment and control groups. There are methods of determining which genes have a high probability of differential…

Quantitative Methods · Quantitative Biology 2007-05-23 David R. Bickel

Bayesian Variable Selection for Probit Mixed Models Applied to Gene Selection

In computational biology, gene expression datasets are characterized by very few individual samples compared to a large number of measurements per sample. Thus, it is appealing to merge these datasets in order to increase the number of…

Methodology · Statistics 2011-08-18 Meili Baragatti

A Novel Anticlustering Filtering Algorithm for the Prediction of Genes as a Drug Target

The high-throughput data generated by microarray experiments provide a complete set of the genes expressed in a given cell or organism under particular conditions. The analysis of these enormous data has opened a new dimension for…

Computational Engineering, Finance, and Science · Computer Science 2012-11-12 Khalid Raza, Akhilesh Mishra

Exploiting the Accumulated Evidence for Gene Selection in Microarray Gene Expression Data

Machine learning methods have of late made significant efforts toward solving multidisciplinary problems in the field of cancer classification using microarray gene expression data. Feature subset selection methods can play an important role in…

Computational Engineering, Finance, and Science · Computer Science 2013-03-04 G. Prat, Ll. Belanche

Global Gene Expression Analysis Using Machine Learning Methods

Microarray is a technology for quantitatively monitoring the expression of a large number of genes in parallel. It has become one of the main tools for global gene expression analysis in molecular biology research in recent years. The large…

Quantitative Methods · Quantitative Biology 2015-06-18 Min Xu

Sample Size Planning for Classification Models

In biospectroscopy, suitably annotated and statistically independent samples (e.g. patients, batches, etc.) for classifier training and testing are scarce and costly. Learning curves show the model performance as a function of the training…

Applications · Statistics 2015-05-05 Claudia Beleites, Ute Neugebauer, Thomas Bocklitz, Christoph Krafft, Jürgen Popp

A Comprehensive Evaluation of Machine Learning Techniques for Cancer Class Prediction Based on Microarray Data

Prostate cancer is among the most common cancers in males, and its heterogeneity is well known. Its early detection helps in making therapeutic decisions. There is no standard technique or procedure yet which is foolproof in predicting cancer…

Machine Learning · Computer Science 2018-12-18 Khalid Raza, Atif N Hasan

A new parsimonious method for classifying Cancer Tissue-of-Origin Based on DNA Methylation 450K data

DNA methylation is a well-studied genetic modification that regulates gene transcription in eukaryotes. Its alterations have been recognized as a significant component of cancer development. In this study, we use the DNA methylation 450k…

Tissues and Organs · Quantitative Biology 2021-01-05 Shen Jia, Yulin Zhang, Yiming Mao, Jiawei Gao, Yixuan Chen, Yuxuan Jiang, Haochen Luo, Kebo Lv, Jionglong Su

The efficacy of various machine learning models for multi-class classification of RNA-seq expression data

Late diagnosis and high costs are key factors that negatively impact the care of cancer patients worldwide. Although the availability of biological markers for the diagnosis of cancer type is increasing, costs and reliability of tests…

Machine Learning · Computer Science 2019-08-20 Sterling Ramroach, Melford John, Ajay Joshi

Diagnosis of Acute Myeloid Leukaemia Using Machine Learning

We train a machine learning model on a dataset of 2177 individuals, using 26 probe sets and each individual's age as features, in order to classify whether someone has acute myeloid leukaemia or is healthy. The dataset is multicentric and consists of data…

Machine Learning · Computer Science 2021-08-18 A. Angelakis, I. Soulioti

Sparse linear discriminant analysis by thresholding for high dimensional data

In many social, economic, biological and medical studies, one objective is to classify a subject into one of several classes based on a set of variables observed from the subject. Because the probability distribution of the variables is…

Statistics Theory · Mathematics 2011-05-19 Jun Shao, Yazhen Wang, Xinwei Deng, Sijian Wang

Finding Important Genes from High-Dimensional Data: An Appraisal of Statistical Tests and Machine-Learning Approaches

Over the past decades, statisticians and machine-learning researchers have developed literally thousands of new tools for the reduction of high-dimensional data in order to identify the variables most responsible for a particular trait.…

Machine Learning · Statistics 2012-05-31 Chamont Wang, Jana Gevertz, Chaur-Chin Chen, Leonardo Auslender

Logistic regression models for patient-level prediction based on massive observational data: Do we need all data?

Objective: Provide guidance on sample size considerations for developing predictive models by empirically establishing the adequate sample size, which balances the competing objectives of improving model performance and reducing model…

Applications · Statistics 2024-07-25 Luis H. John, Jan A. Kors, Jenna M. Reps, Patrick B. Ryan, Peter R. Rijnbeek

Meta-Learning on Augmented Gene Expression Profiles for Enhanced Lung Cancer Detection

Gene expression profiles obtained through DNA microarray have proven successful in providing critical information for cancer detection classifiers. However, the limited number of samples in these datasets poses a challenge to employ complex…

Machine Learning · Computer Science 2024-08-20 Arya Hadizadeh Moghaddam, Mohsen Nayebi Kerdabadi, Cuncong Zhong, Zijun Yao

A deep generative model for gene expression profiles from single-cell RNA sequencing

We propose a probabilistic model for interpreting gene expression levels that are observed through single-cell RNA sequencing. In the model, each cell has a low-dimensional latent representation. Additional latent variables account for…

Machine Learning · Computer Science 2018-01-18 Romain Lopez, Jeffrey Regier, Michael Cole, Michael Jordan, Nir Yosef