Related papers: Fusing heterogeneous data sets
In systems biology, it is becoming increasingly common to measure biochemical entities at different levels of the same biological system. Hence, data fusion problems are abundant in the life sciences. With the availability of a multitude of…
Multiple technologies that measure expression levels of protein mixtures in the human body offer potential for detecting and understanding disease. The recent increase in these technologies prompts researchers to evaluate the…
Multimodal data modeling has emerged as a powerful approach in clinical research, enabling the integration of diverse data types such as imaging, genomics, wearable sensors, and electronic health records. Despite its potential to improve…
Metabolomics is becoming a mature part of analytical chemistry as evidenced by the growing number of publications and attendees of international conferences dedicated to this topic. Yet, a systematic treatment of the fundamental structure…
In many areas of science multiple sets of data are collected pertaining to the same system. Examples are food products which are characterized by different sets of variables, bio-processes which are on-line sampled with different…
New technologies have enabled the investigation of biology and human health at an unprecedented scale and in multiple dimensions. These dimensions include a myriad of properties describing genome, epigenome, transcriptome, microbiome,…
As metabolomics datasets are becoming larger and more complex, there is an increasing need for model-based data integration and analysis to optimally leverage these data. Dynamical models of metabolism allow for the integration of…
The availability of multi-omics data has revolutionized the life sciences by creating avenues for integrated system-level approaches. Data integration links the information across datasets to better understand the underlying biological…
In the real world, most objects and data have multiple types of attributes and inter-connections. Such data structures are named "Heterogeneous Information Networks" (HIN) and have been widely researched. Biological systems are also…
We consider the fusion of two aerodynamic data sets originating from differing fidelity physical or computer experiments. We specifically address the fusion of: 1) noisy and incomplete fields from wind tunnel measurements and 2)…
Recent advances in single cell sequencing and multi-omics techniques have significantly improved our understanding of biological phenomena and our capacity to model them. Despite combined capture of data modalities showing similar progress,…
Background. Emerging technologies now allow for mass spectrometry-based profiling of up to thousands of small molecule metabolites (metabolomics) in an increasing number of biosamples. While offering great promise for revealing insight into…
A major challenge in nuclear fusion research is the coherent combination of data from heterogeneous diagnostics and modelling codes for machine control and safety as well as physics studies. Measured data from different diagnostics often…
A distinguishing feature of a multicellular living system is that it operates at various scales, from the intracellular to organismal. Very little is known at present on how tissue level properties are related to cell and subcellular…
This paper considers the two-dataset problem, where data are collected from two potentially different populations sharing common aspects. This problem arises when data are collected by two different types of researchers or from two…
A proper fusion of complex data is of interest to many researchers in diverse fields, including computational statistics, computational geometry, bioinformatics, machine learning, pattern recognition, quality management, engineering,…
Removing the bias and variance of multicentre data has always been a challenge in large-scale digital healthcare studies, which require the ability to integrate clinical features extracted from data acquired by different scanners and…
Mass spectrometry-based metabolomic analysis depends upon the identification of spectral peaks by their mass and retention time. Statistical analysis that follows the identification currently relies on one main peak of each compound.…
Data is of high quality if it is fit for its intended use. The quality of data is influenced by the underlying data model and its quality. One major quality problem is the heterogeneity of data as quality aspects such as understandability…
The integration of data from multiple sources is increasingly used to achieve larger sample sizes and enhance population diversity. Our previous work established that, under random sampling from the same underlying population, integrating…