Related papers: Fusing heterogeneous data sets
In many studies of human diseases, multiple omic datasets are measured. Typically, these omic datasets are studied one by one with the disease, so the relationships between omics are overlooked. Modeling the joint part of multiple omics…
In the post-genomic era, with the advent of new technologies, huge amounts of complex molecular data are generated at high throughput. Managing this biological data is a challenging task due to complexity and heterogeneity…
There are many issues that can cause problems when attempting to infer model parameters from data. Data and models are both imperfect, and as such there are multiple scenarios in which standard methods of inference will lead to misleading…
In many applications, data can be heterogeneous in the sense of spanning latent groups with different underlying distributions. When predictive models are applied to such data the heterogeneity can affect both predictive performance and…
In many applications involving multi-media data, the definition of similarity between items is integral to several key tasks, e.g., nearest-neighbor retrieval, classification, and recommendation. Data in such regimes typically exhibits…
The generalisation of Neural Networks (NN) to multiple datasets is often overlooked in the literature because NNs are typically optimised for specific data sources. This becomes especially challenging in time-series-based multi-dataset models…
The modeling of the spreading of communicable diseases has experienced significant advances in the last two decades or so. This has been possible due to the proliferation of data and the development of new methods to gather, mine and…
Biometrics is the science and technology of measuring and analyzing biological data of the human body, extracting a feature set from the acquired data, and comparing this set against the template set in the database. Experimental studies…
Statistical matching aims to integrate two statistical sources. These sources can be two samples or a sample and the entire population. If two samples have been selected from the same population and information has been collected on…
Systems biology approaches to the integrative study of cells, organs and organisms offer the best means of understanding in a holistic manner the diversity of molecular assays that can now be implemented in a high throughput manner. Such…
In human microbiome studies, sequencing reads data are often summarized as counts of bacterial taxa at various taxonomic levels specified by a taxonomic tree. This paper considers the problem of analyzing two repeated measurements of…
Data analysis based on information from several sources is common in economic and biomedical studies. This setting is often referred to as the data fusion problem, which differs from traditional missing data problems since no complete data…
Often in surveys, key items are subject to measurement errors. Given just the data, it can be difficult to determine the distribution of this error process, and hence to obtain accurate inferences that involve the error-prone variables. In…
Heterogeneous datasets emerge in various machine learning and optimization applications that feature different input sources, types or formats. Most models or methods do not natively tackle heterogeneity. Hence, such datasets are often…
The macroscopic behavior of the solution of a coupled system of partial differential equations arising in the modeling of reaction-diffusion processes in periodic porous media is analyzed. Our mathematical model can be used for studying…
Combining the results of different search engines in order to improve upon their performance has been the subject of many research papers. This has become known as the "Data Fusion" task, and has great promise in dealing with the vast…
Exponential increases in scientific experimental data are outstripping the rate of progress in silicon technology. As a result, heterogeneous combinations of architectures and process or device technologies are increasingly important to…
Heterogeneity is a hallmark of complex diseases. Regression-based heterogeneity analysis, which is directly concerned with outcome-feature relationships, has led to a deeper understanding of disease biology. Such an analysis identifies the…
Methods for quantifying the similarity of datasets are relevant in applications where two or more datasets, or their underlying distributions, need to be compared, ranging from two- and k-sample testing to applications in machine learning…
Heterogeneous systems of active matter exhibit a range of complex emergent dynamical patterns. In particular, it is difficult to predict the properties of the mixed system based on its constituents. These considerations are particularly…