Related papers: Fusing heterogeneous data sets

200 papers

Hyperspectral imaging is a powerful technology that is plagued by large dimensionality. Herein, we explore a way to combat that hindrance via non-contiguous and contiguous (simpler to realize sensor) band grouping for dimensionality…

Image and Video Processing · Electrical Eng. & Systems 2019-05-31 Muhammad Aminul Islam, Derek T. Anderson, John E. Ball, Nicolas H. Younan

Phenotypic variation is a hallmark of cellular physiology. Metabolic heterogeneity, in particular, underpins single-cell phenomena such as microbial drug tolerance and growth variability. Much research has focussed on transcriptomic and…

Molecular Networks · Quantitative Biology 2019-01-31 Mona K. Tonn, Philipp Thomas, Mauricio Barahona, Diego A Oyarzún

We provide new algorithms for two tasks relating to heterogeneous tabular datasets: clustering, and synthetic data generation. Tabular datasets typically consist of heterogeneous data types (numerical, ordinal, categorical) in columns, but…

Machine Learning · Computer Science 2024-04-22 Chandrani Kumari, Rahul Siddharthan

Integrative learning of multiple datasets has the potential to mitigate the challenge of small $n$ and large $p$ that is often encountered in analysis of big biomedical data such as genomics data. Detection of weak yet important signals can…

Methodology · Statistics 2022-07-04 Changgee Chang, Zongyu Dai, Jihwan Oh, Qi Long

Increasingly sophisticated experiments, coupled with large-scale computational models, have the potential to systematically test biological hypotheses to drive our understanding of multicellular systems. In this short review, we explore key…

Quantitative Methods · Quantitative Biology 2019-10-28 Paul Macklin

Metabolic heterogeneity is widely recognised as the next challenge in our understanding of non-genetic variation. A growing body of evidence suggests that metabolic heterogeneity may result from the inherent stochasticity of intracellular…

Molecular Networks · Quantitative Biology 2020-10-08 Mona K Tonn, Philipp Thomas, Mauricio Barahona, Diego A Oyarzún

Biclustering is an unsupervised machine-learning approach aiming to cluster rows and columns simultaneously in a data matrix. Several biclustering algorithms have been proposed for handling numeric datasets. However, real-world data mining…

Machine Learning · Computer Science 2024-08-26 Adán José-García, Julie Jacques, Clément Chauvet, Vincent Sobanski, Clarisse Dhaenens

To estimate accurately the parameters of a regression model, the sample size must be large enough relative to the number of possible predictors for the model. In practice, sufficient data is often lacking, which can lead to overfitting of…

Applications · Statistics 2024-09-25 Marianne A Jonker, Hassan Pazira, Anthony CC Coolen

The problem of information fusion from multiple data-sets acquired by multimodal sensors has drawn significant research attention over the years. In this paper, we focus on a particular problem setting consisting of a physical phenomenon or…

Machine Learning · Statistics 2018-11-21 Ori Katz, Ronen Talmon, Yu-Lun Lo, Hau-Tieng Wu

For most problems in science and engineering we can obtain data sets that describe the observed system from various perspectives and record the behavior of its individual components. Heterogeneous data sets can be collectively mined by data…

Machine Learning · Computer Science 2015-02-09 Marinka Žitnik, Blaž Zupan

In multicenter biomedical research, integrating data from multiple decentralized sites provides more robust and generalizable findings due to its larger sample size and the ability to account for the between-site heterogeneity. However,…

Methodology · Statistics 2025-12-29 Xiaokang Liu, Yuchen Yang, Yifei Sun, Jiang Bian, Yanyuan Ma, Raymond J. Carroll, Yong Chen

One of the purposes of Big Data systems is to support analysis of data gathered from heterogeneous data sources. Since data warehouses have been used for several decades to achieve the same goal, they could be leveraged also to provide…

Databases · Computer Science 2018-09-13 Darja Solodovnikova, Laila Niedrite

We revisit the modeling of the diauxic growth of a pure microorganism on two distinct sugars which was first described by Monod. Most available models are deterministic and make the assumption that all cells of the microbial ecosystem…

Populations and Evolution · Quantitative Biology 2020-05-19 Carl Graham, Jérôme Harmand, Sylvie Méléard, Josué Tchouanti

Entity matching (EM) is a fundamental task in data integration and analytics, essential for identifying records that refer to the same real-world entity across diverse sources. In practice, datasets often differ widely in structure, format,…

Databases · Computer Science 2026-02-09 Mohammad Hossein Moslemi, Amir Mousavi, Behshid Behkamal, Mostafa Milani

Mass-spectrometry technologies are widely used in the fields of ionomics and metabolomics to simultaneously profile at the genome scale intracellular concentrations of e.g. amino acids or elements. Short profiles of molecular or…

Molecular Networks · Quantitative Biology 2020-11-12 Jacopo Iacovacci, Alina Peluso, Timothy Ebbels, Markus Ralser, Robert Charles Glen

Evaluating whether data streams are drawn from the same distribution is at the heart of various machine learning problems. This is particularly relevant for data generated by dynamical systems since such systems are essential for many…

Change detection in heterogeneous multitemporal satellite images is a challenging and still little-studied topic in remote sensing and earth observation. This paper focuses on comparison of image pairs covering the same geographical area…

Computer Vision and Pattern Recognition · Computer Science 2017-02-13 Luigi Tommaso Luppino, Stian Normann Anfinsen, Gabriele Moser, Robert Jenssen, Filippo Maria Bianchi, Sebastiano Serpico, Gregoire Mercier

A consistent theme in software experimentation at Microsoft has been solving problems of experimentation at scale for a diverse set of products. Running experiments at scale (i.e., many experiments on many users) has become state of the art…

Applications · Statistics 2019-12-03 Craig Boucher, Ulf Knoblich, Daniel Miller, Sasha Patotski, Amin Saied, Venky Venkateshaiah

The envelope model, also known as a multivariate regression model, was proposed to solve multiple-response regression problems. It measures the linear association between predictors and multiple responses by using the minimal reducing subspace…

Methodology · Statistics 2018-05-07 Bochao Jia

Humans make accurate decisions by interpreting complex data from multiple sources. Medical diagnostics, in particular, often hinge on human interpretation of multi-modal information. In order for artificial intelligence to make progress in…

Computer Vision and Pattern Recognition · Computer Science 2018-11-20 Faisal Mahmood, Ziyun Yang, Thomas Ashley, Nicholas J. Durr