Related papers: Classification of arrayCGH data using a fused SVM
Motivation: Array Comparative Genomic Hybridization (aCGH) is used to scan the entire genome for variations in DNA copy number. A central task in the analysis of aCGH data is the segmentation into groups of probes sharing the same DNA copy…
Array comparative genomic hybridization (CGH) is a high-resolution technique for assessing DNA copy number variation. Identifying breakpoints where copy number changes will enhance the understanding of the pathogenesis of human diseases, such as…
Array-Based Comparative Genomic Hybridization (aCGH) is a method used to search for genomic regions with copy number variations. For a given aCGH profile, one challenge is to accurately segment it into regions of constant copy number.…
Several modern genomic technologies, such as DNA-Methylation arrays, measure spatially registered probes that number in the hundreds of thousands across multiple chromosomes. The measured probes are by themselves less interesting…
We propose a new approach for clustering DNA features using array CGH data from multiple tumor samples. We distinguish data-collapsing: joining contiguous DNA clones or probes with extremely similar data into regions, from clustering:…
The development of cancer is largely driven by the gain or loss of subsets of the genome, promoting uncontrolled growth or disabling defenses against it. Identifying genomic regions whose DNA copy number deviates from the normal is…
A number of statistical models have been successfully developed for the analysis of high-throughput data from a single source, but few methods are available for integrating data from different sources. Here we focus on integrating gene…
Extracting genetic information from a full range of sequencing data is important for understanding diseases. We propose a novel method to effectively explore the landscape of genetic mutations and aggregate them to predict cancer type. We…
The DNA microarray technology has modernized the approach of biology research in such a way that scientists can now measure the expression levels of thousands of genes simultaneously in a single experiment. Gene expression profiles, which…
One notable field in the study of cancer genetics is disease gene identification, which affects disease treatment and drug discovery. Much research has been done in this field. Genome-wide association studies (GWAS) are one of…
The integration of knowledge graphs and graph machine learning (GML) in genomic data analysis offers several opportunities for understanding complex genetic relationships, especially at the RNA level. We present a comprehensive approach for…
Various approaches to gene selection for cancer classification based on microarray data can be found in the literature and they may be grouped into two categories: univariate methods and multivariate methods. Univariate methods look at each…
Clustering has long been a popular unsupervised learning approach to identify groups of similar objects and discover patterns from unlabeled data in many applications. Yet, coming up with meaningful interpretations of the estimated clusters…
In cancer research, high-throughput profiling has been conducted extensively, and recent studies have carried out integrative analysis of data on multiple cancer patient groups/subgroups. Such analysis has the potential to reveal the…
Identifying relationships between molecular variations and their clinical presentations has been challenged by the heterogeneous causes of a disease. It is imperative to unveil the relationship between the high dimensional molecular…
In recent years, finding meaningful patterns in huge datasets has become both more important and more challenging. Data miners try to address this problem by applying feature selection methods. In this paper we propose…
We propose a Classification Via Clustering (CVC) algorithm which enables existing clustering methods to be efficiently employed in classification problems. In CVC, training and test data are co-clustered and class-cluster distributions are…
Heterogeneity is a fundamental characteristic of cancer. To accommodate heterogeneity, subgroup identification has been extensively studied and broadly categorized into unsupervised and supervised analysis. Compared to unsupervised…
In the analysis of spatially resolved transcriptomics data, detecting spatially variable genes (SVGs) is crucial. Numerous computational methods exist, but varying SVG definitions and methodologies lead to incomparable results. We review 33…
The support vector machine (SVM) and deep learning (e.g., convolutional neural networks (CNNs)) are the two best-known approaches for small and big data, respectively. Nonetheless, smaller datasets may be very important, costly, and not…