Related papers: Fusing heterogeneous data sets

200 papers

We develop large sample theory for merged data from multiple sources. Main statistical issues treated in this paper are (1) the same unit potentially appears in multiple datasets from overlapping data sources, (2) duplicated items are not…

Statistics Theory · Mathematics 2018-05-22 Takumi Saegusa

Multi-view clustering leverages consistent and complementary information across multiple views to provide more comprehensive insights than single-view analysis. However, the heterogeneity and redundancy of multi-view data pose significant…

Optimization and Control · Mathematics 2025-08-12 Xiangru Xing, Yan Li, Xin Wang, Huangyue Chen, Xianchao Xiu

Functional data clustering is to identify heterogeneous morphological patterns in the continuous functions underlying the discrete measurements/observations. Application of functional data clustering has appeared in many publications across…

Methodology · Statistics 2022-10-04 Mimi Zhang, Andrew Parnell

Data deduplication is the task of detecting records in a database that correspond to the same real-world entity. Our goal is to develop a procedure that samples uniformly from the set of entities present in the database in the presence of…

Machine Learning · Computer Science 2020-08-25 Alireza Heidari, Shrinu Kushagra, Ihab F. Ilyas

Systems biology models are useful models of complex biological systems that may require a large amount of experimental data to fit each model's parameters or to approximate a likelihood function. These models range from a few to thousands…

Quantitative Methods · Quantitative Biology 2024-07-12 Vincent D. Zaballa, Elliot E. Hui

There is no consensus in the field of synthetic data on concise metrics for quality evaluations or benchmarks on large health datasets, such as historical epidemiological data. This study presents an evaluation of seven recent models from…

Machine Learning · Computer Science 2026-04-20 Jean-Baptiste Escudié, Benjamin Barnes, Stefan Meisegeier, Klaus Kraywinkel, Fabian Prasser, Nils Körber

It is well known that the integration among different data-sources is reliable because of its potential of unveiling new functionalities of the genomic expressions which might be dormant in a single source analysis. Moreover, different…

Methodology · Statistics 2021-12-08 Arnab Kumar Maity, Sang Chan Lee, Bani K. Mallick, Tapasree Roy Sarkar

This work presents an omics-driven modeling pipeline that integrates machine-learning tools to facilitate the dynamic modeling of multiscale biological systems. Random forests and permutation feature importance are proposed to mine omics…

Quantitative Methods · Quantitative Biology 2025-01-17 Sebastián Espinel-Ríos, José Montaño López, José L. Avalos

In the big data era, integrating diverse data modalities poses significant challenges, particularly in complex fields like healthcare. This paper introduces a new process model for multimodal Data Fusion for Data Mining, integrating…

Artificial Intelligence · Computer Science 2024-06-04 David Restrepo, Chenwei Wu, Constanza Vásquez-Venegas, Luis Filipe Nakayama, Leo Anthony Celi, Diego M López

The human-associated microbiome is closely tied to human health and is of substantial clinical interest. Metagenomics-based tools are emerging for clinical diagnostics, tracking the spread of diseases, and surveillance of potential…

Gene expression-based heterogeneity analysis has been extensively conducted. In recent studies, it has been shown that network-based analysis, which takes a system perspective and accommodates the interconnections among genes, can be more…

Methodology · Statistics 2023-08-09 Rong Li, Qingzhao Zhang, Shuangge Ma

The joint analysis of biomedical data in Alzheimer's Disease (AD) is important for better clinical diagnosis and to understand the relationship between biomarkers. However, jointly accounting for heterogeneous measures poses important…

Methodology · Statistics 2018-08-14 Luigi Antelmi, Nicholas Ayache, Philippe Robert, Marco Lorenzi

Heterogeneous networks play a key role in the evolution of communities and the decisions individuals make. These networks link different types of entities, for example, people and the events they attend. Network analysis algorithms usually…

Computers and Society · Computer Science 2016-11-17 Rumi Ghosh, Kristina Lerman

Large-scale data analysis poses both statistical and computational problems which need to be addressed simultaneously. A solution is often straightforward if the data are homogeneous: one can use classical ideas of subsampling and mean…

Methodology · Statistics 2014-09-10 Peter Bühlmann, Nicolai Meinshausen

There has been much research activity in recent times about providing the data infrastructures needed for the provision of personalised healthcare. In particular the requirement of integrating multiple, potentially distributed,…

Databases · Computer Science 2008-12-16 Andrew Branson, Tamas Hauer, Richard McClatchey, Dmitry Rogulin, Jetendr Shamdasani

The success of metabolomics studies depends upon the "fitness" of each biological sample used for analysis: it is critical that metabolite levels reported for a biological sample represent an accurate snapshot of the studied organism's…

Quantitative Methods · Quantitative Biology 2015-06-16 Barry M. Slaff, Shane T. Jensen, Aalim M. Weljie

Integrated analysis of multi-omics datasets holds great promise for uncovering complex biological processes. However, the large dimension of omics data poses significant interpretability and multiple testing challenges. Simultaneous…

Methodology · Statistics 2024-10-28 Mitra Ebrahimpoor, Renee Menezes, Ningning Xu, Jelle J. Goeman

The random coefficients model is an extension of the linear regression model that allows for unobserved heterogeneity in the population by modeling the regression coefficients as random variables. Given data from this model, the statistical…

Methodology · Statistics 2018-03-15 Fabian Dunker, Konstantin Eckle, Katharina Proksch, Johannes Schmidt-Hieber

Populations of heterogeneous cells play an important role in many biological systems. In this paper we consider systems where each cell can be modelled by an ordinary differential equation. To account for heterogeneity, parameter values are…

Quantitative Methods · Quantitative Biology 2009-09-27 Steffen Waldherr, Jan Hasenauer, Frank Allgöwer

Estimating how a treatment affects units individually, known as heterogeneous treatment effect (HTE) estimation, is an essential part of decision-making and policy implementation. The accumulation of large amounts of data in many domains,…

Machine Learning · Computer Science 2022-06-28 Christopher Tran, Elena Zheleva