Related papers: How Many Genes Are Needed for a Discriminant Micro…
Motivation: Microarray data has been recently been shown to be efficacious in distinguishing closely related cell types that often appear in the diagnosis of cancer. It is useful to determine the minimum number of genes needed to do such a…
One important issue commonly encountered in the analysis of microarray data is to decide which and how many genes should be selected for further studies. For discriminant microarray data analyses based on statistical models, such as the…
Various approaches to gene selection for cancer classification based on microarray data can be found in the literature and they may be grouped into two categories: univariate methods and multivariate methods. Univariate methods look at each…
Many machine learning models have been proposed to classify phenotypes from gene expression data. In addition to their good performance, these models can potentially provide some understanding of phenotypes by extracting explanations for…
Gene expression analysis aims at identifying the genes able to accurately predict biological parameters like, for example, disease subtyping or progression. While accurate prediction can be achieved by means of many different techniques,…
Microarray data are often used to determine which genes are differentially expressed between groups, for example, between treatment and control groups. There are methods of determining which genes have a high probability of differential…
In computational biology, gene expression datasets are characterized by very few individual samples compared to a large number of measurements per sample. Thus, it is appealing to merge these datasets in order to increase the number of…
The high-throughput data generated by microarray experiments provides complete set of genes being expressed in a given cell or in an organism under particular conditions. The analysis of these enormous data has opened a new dimension for…
Machine Learning methods have of late made significant efforts to solving multidisciplinary problems in the field of cancer classification using microarray gene expression data. Feature subset selection methods can play an important role in…
Microarray is a technology to quantitatively monitor the expression of large number of genes in parallel. It has become one of the main tools for global gene expression analysis in molecular biology research in recent years. The large…
In biospectroscopy, suitably annotated and statistically independent samples (e. g. patients, batches, etc.) for classifier training and testing are scarce and costly. Learning curves show the model performance as function of the training…
Prostate cancer is among the most common cancer in males and its heterogeneity is well known. Its early detection helps making therapeutic decision. There is no standard technique or procedure yet which is full-proof in predicting cancer…
DNA methylation is a well-studied genetic modification that regulates gene transcription of Eukaryotes. Its alternations have been recognized as a significant component of cancer development. In this study, we use the DNA methylation 450k…
Late diagnosis and high costs are key factors that negatively impact the care of cancer patients worldwide. Although the availability of biological markers for the diagnosis of cancer type is increasing, costs and reliability of tests…
We train a machine learning model on a dataset of 2177 individuals using as features 26 probe sets and their age in order to classify if someone has acute myeloid leukaemia or is healthy. The dataset is multicentric and consists of data…
In many social, economical, biological and medical studies, one objective is to classify a subject into one of several classes based on a set of variables observed from the subject. Because the probability distribution of the variables is…
Over the past decades, statisticians and machine-learning researchers have developed literally thousands of new tools for the reduction of high-dimensional data in order to identify the variables most responsible for a particular trait.…
Objective: Provide guidance on sample size considerations for developing predictive models by empirically establishing the adequate sample size, which balances the competing objectives of improving model performance and reducing model…
Gene expression profiles obtained through DNA microarray have proven successful in providing critical information for cancer detection classifiers. However, the limited number of samples in these datasets poses a challenge to employ complex…
We propose a probabilistic model for interpreting gene expression levels that are observed through single-cell RNA sequencing. In the model, each cell has a low-dimensional latent representation. Additional latent variables account for…