Related papers: Combining haplotypers
In genetic studies, haplotype data provide more refined information than data about separate genetic markers. However, large-scale studies that genotype hundreds to thousands of individuals may only provide results of pooled data, where…
Haplotype Inference is a challenging problem in bioinformatics that consists in inferring the basic genetic constitution of diploid organisms on the basis of their genotype. This information allows researchers to perform association studies…
Background: Haplotypes, the ordered lists of single nucleotide variations that distinguish chromosomal sequences from their homologous pairs, may reveal an individual's susceptibility to hereditary and complex diseases and affect how our…
We consider the problem of detecting and estimating the strength of association between a trait of interest and alleles or haplotypes in a small genomic region (e.g. a gene or a gene complex), when no direct information on that region is…
Network inference is a rapidly advancing field, with new methods being proposed on a regular basis. Understanding the advantages and limitations of different network inference methods is key to their effective application in different…
The combination of multiple classifiers using ensemble methods is increasingly important for making progress in a variety of difficult prediction problems. We present a comparative analysis of several ensemble methods through two case…
In the genomic era, the identification of gene signatures associated with disease is of significant interest. Such signatures are often used to predict clinical outcomes in new patients and aid clinical decision-making. However, recent…
Combining p-values from independent statistical tests is a popular approach to meta-analysis, particularly when the data underlying the tests are either no longer available or are difficult to combine. A diverse range of p-value combination…
This paper examines the use of a hierarchical coevolutionary genetic algorithm under different partnering strategies. Cascading clusters of sub-populations are built from the bottom up, with higher-level sub-populations optimising larger…
The perennial problem of "how many clusters?" remains an issue of substantial interest in data mining and machine learning communities, and becomes particularly salient in large data sets such as populational genomic data where the number…
We consider statistical procedures for hypothesis testing of real valued functionals of matched pairs with missing values. In order to improve the accuracy of existing methods, we propose a novel multiplication combination procedure.…
DNA samples are often pooled, either by experimental design, or because the sample itself is a mixture. For example, when population allele frequencies are of primary interest, individual samples may be pooled together to lower the cost of…
Binary observations are often repeated to improve data quality, creating technical replicates. Several scoring methods are commonly used to infer the actual individual state and obtain a probability for each state. The common practice of…
The usage of machine learning methods in traditional surveys including official statistics, is still very limited. Therefore, we propose a predictor supported by these algorithms, which can be used to predict any population or subpopulation…
A method is discussed that allows combining sets of differential or inclusive measurements. It is assumed that at least one measurement was obtained with simultaneously fitting a set of nuisance parameters, representing sources of…
Supervised and unsupervised homography estimation methods depend on image pairs tailored to specific modalities to achieve high accuracy. However, their performance deteriorates substantially when applied to unseen modalities. To address…
This issue includes six articles that develop and apply statistical methods for the analysis of gene sequencing data of different types. The methods are tailored to the different data types and, in each case, lead to biological insights not…
Single individual haplotyping is an NP-hard problem that emerges when attempting to reconstruct an organism's inherited genetic variations using data typically generated by high-throughput DNA sequencing platforms. Genomes of diploid…
In this paper we present a collection of results pertaining to haplotyping. The first set of results concerns the combinatorial problem of reconstructing haplotypes from incomplete and/or imperfectly sequenced haplotype data. More…
Replication helps ensure that a genotype-phenotype association observed in a genome-wide association (GWA) study represents a credible association and is not a chance finding or an artifact due to uncontrolled biases. We discuss…