Related papers: Deep Learning for Efficient GWAS Feature Selection
Traditional GWAS has advanced our understanding of complex diseases but often misses nonlinear genetic interactions. Deep learning offers new opportunities to capture complex genomic patterns, yet existing methods mostly depend on feature…
To understand how genetic variants in human genomes manifest in phenotypes -- traits like height or diseases like asthma -- geneticists have sequenced and measured hundreds of thousands of individuals. Geneticists use this data to build…
Through genome-wide association studies (GWAS), disease susceptible genetic variables can be identified by comparing the genetic data of individuals with and without a specific disease. However, the discovery of these associations poses a…
A genome-wide association study (GWAS) correlates marker variation with trait variation in a sample of individuals. Each study subject is genotyped at a multitude of SNPs (single nucleotide polymorphisms) spanning the genome. Here we assume…
Genome Wide Association Studies (GWAS) are used to identify statistically significant genetic variants in case-control studies. GWAS typically use a p-value threshold of 5 x 10^-8 to identify highly ranked single nucleotide polymorphisms…
In this paper, association results from genome-wide association studies (GWAS) are combined with a deep learning framework to test the predictive capacity of statistically significant single nucleotide polymorphisms (SNPs) associated with…
Genome-wide association studies (GWAS) have identified hundreds of loci at very stringent levels of statistical significance across many different human traits. However, it is now clear that very large samples (n~10^4-10^5) are needed to…
Feature selection, as a critical pre-processing step for machine learning, aims at determining representative predictors from a high-dimensional feature space dataset to improve the prediction accuracy. However, the increase in feature…
Disease-gene association through Genome-wide association study (GWAS) is an arduous task for researchers. Investigating single nucleotide polymorphisms (SNPs) that correlate with specific diseases needs statistical analysis of associations.…
For the vast majority of genome wide association studies (GWAS) published so far, statistical analysis was performed by testing markers individually. In this article we present some elementary statistical considerations which clearly show…
The applications of traditional statistical feature selection methods to high-dimension, low sample-size data often struggle and encounter challenging problems, such as overfitting, curse of dimensionality, computational infeasibility, and…
Although genome-wide association studies (GWAS) have proven powerful for comprehending the genetic architecture of complex traits, they are challenged by a high dimension of single-nucleotide polymorphisms (SNPs) as predictors, the presence…
The aetiology of polygenic obesity is multifactorial, which indicates that life-style and environmental factors may influence multiple genes to aggravate this disorder. Several low-risk single nucleotide polymorphisms (SNPs) have been…
Gene expression data represents a unique challenge in predictive model building, because of the small number of samples $(n)$ compared to the huge amount of features $(p)$. This "$n \ll p$" property has hampered application of deep learning…
Genome-wide association studies (GWAS) have achieved great success in the genetic study of Alzheimer's disease (AD). Collaborative imaging genetics studies across different research institutions show the effectiveness of detecting genetic…
Motivation: Genome-Wide Association Studies (GWAS) seek to identify causal genomic variants associated with rare human diseases. The classical statistical approach for detecting these variants is based on univariate hypothesis testing, with…
Understanding the genetic basis of complex traits is a longstanding challenge in the field of genomics. Genome-wide association studies (GWAS) have identified thousands of variant-trait associations, but most of these variants are located…
Biological data including gene expression data are generally high-dimensional and require efficient, generalizable, and scalable machine-learning methods to discover their complex nonlinear patterns. The recent advances in machine learning…
High-dimensional data in many machine learning applications leads to computational and analytical complexities. Feature selection provides an effective way for solving these problems by removing irrelevant and redundant features, thus…
In this paper, we present a novel approach to accelerate the Bayesian inference process, focusing specifically on the nested sampling algorithms. Bayesian inference plays a crucial role in cosmological parameter estimation, providing a…