Related papers: Fusing heterogeneous data sets

200 papers

Data heterogeneity plays a pivotal role in determining the performance of machine learning (ML) systems. Traditional algorithms, which are typically designed to optimize average performance, often overlook the intrinsic diversity within…

Machine Learning · Computer Science 2025-06-03 Jiashuo Liu, Peng Cui

Routine clinical visits of a patient produce not only image data, but also non-image data containing clinical information regarding the patient, i.e., medical data is multi-modal in nature. Such heterogeneous modalities offer different and…

Computer Vision and Pattern Recognition · Computer Science 2023-03-16 Sein Kim, Namkyeong Lee, Junseok Lee, Dongmin Hyun, Chanyoung Park

In this paper, we conduct a simulation study with subject-level data to evaluate conventional meta-regression approaches (study-level random, fixed, and mixed effects) against seven methodology specifications new to meta-regressions that…

Econometrics · Economics 2025-07-18 Ali Habibnia, Jonathan Gendron

Deciphering cell type heterogeneity is crucial for systematically understanding tissue homeostasis and its dysregulation in diseases. Computational deconvolution is an efficient approach estimating cell type abundances from a variety of…

Other Quantitative Biology · Quantitative Biology 2023-09-06 Lana X. Garmire, Yijun Li, Qianhui Huang, Chuan Xu, Sarah Teichmann, Naftali Kaminski, Matteo Pellegrini, Quan Nguyen, Andrew E. Teschendorff

Tracking multiple time-varying states based on heterogeneous observations is a key problem in many applications. Here, we develop a statistical model and algorithm for tracking an unknown number of targets based on the probabilistic fusion…

Signal Processing · Electrical Eng. & Systems 2022-01-10 Domenico Gaglione, Paolo Braca, Giovanni Soldi, Florian Meyer, Franz Hlawatsch, Moe Z. Win

In this paper, we develop a graphical modeling framework for the inference of networks across multiple sample groups and data types. In medical studies, this setting arises whenever a set of subjects, which may be heterogeneous due to…

This study exploits information fusion in IoT systems and uses a clustering method to identify similarities in behaviours and key characteristics within each cluster. This approach facilitates early detection of behaviour changes and…

Computers and Society · Computer Science 2025-09-12 Mohsen Shirali, Zahra Ahmadi, Jose-Luis Bayo-Monton, Zoe Valero-Ramon, Carlos Fernandez-Llatas

Motivated by image-on-scalar regression with data aggregated across multiple sites, we consider a setting in which multiple independent studies each collect multiple dependent vector outcomes, with potential mean model parameter homogeneity…

Methodology · Statistics 2022-10-06 Emily C. Hector

Bioinformatics research is characterized by voluminous and incremental datasets and complex data analytics methods. The machine learning methods used in bioinformatics are iterative and parallel. These methods can be scaled to handle big…

Computational Engineering, Finance, and Science · Computer Science 2015-06-17 Hirak Kashyap, Hasin Afzal Ahmed, Nazrul Hoque, Swarup Roy, Dhruba Kumar Bhattacharyya

This paper evaluates heterogeneous information fusion using multi-task Gaussian processes in the context of geological resource modeling. Specifically, it empirically demonstrates that information integration across heterogeneous…

Machine Learning · Statistics 2013-09-06 Shrihari Vasudevan, Arman Melkumyan, Steven Scheding

Across biological subdisciplines, the last decade has seen an explosion of high-dimensional datasets, including datasets for cells, species, immune systems, neurons and behaviour. At the ICTS workshop 'Unifying Theories in High-Dimensional…

We propose a method called integrated diffusion for combining multimodal datasets, or data gathered via several different measurements on the same system, to create a joint data diffusion operator. As real world data suffers from both local…

Machine Learning · Computer Science 2022-03-07 Manik Kuchroo, Abhinav Godavarthi, Alexander Tong, Guy Wolf, Smita Krishnaswamy

This study analyzes the impact of heterogeneity ("Variety") in Big Data by comparing classification strategies across structured (Epsilon) and unstructured (Rest-Mex, IMDB) domains. A dual methodology was implemented: evolutionary and…

Estimating heterogeneous treatment effects is an important problem across many domains. In order to accurately estimate such treatment effects, one typically relies on data from observational studies or randomized experiments. Currently,…

Machine Learning · Statistics 2022-02-28 Tobias Hatt, Jeroen Berrevoets, Alicia Curth, Stefan Feuerriegel, Mihaela van der Schaar

Most biometric systems deployed in real-world applications are unimodal. Unimodal biometric systems have to contend with a variety of problems, such as noise in sensed data, intra-class variations, inter-class similarities,…

Computer Vision and Pattern Recognition · Computer Science 2015-06-11 Harbi AlMahafzah, Maen Zaid AlRwashdeh

Density level sets are mainly estimated using one of three methodologies: plug-in, excess mass, or a hybrid approach. The plug-in methods are based on replacing the unknown density by some nonparametric estimator, usually the kernel. Thus,…

High resolution microarrays and second-generation sequencing platforms are powerful tools to investigate genome-wide alterations in DNA copy number, methylation and gene expression associated with a disease. An integrated genomic profiling…

Applications · Statistics 2013-04-22 Ronglai Shen, Sijian Wang, Qianxing Mo

The problem addressed here is that of simultaneous treatment of several gene expression datasets, possibly collected under different experimental conditions and/or platforms. Using robust statistics, a large scale statistical analysis has…

Methodology · Statistics 2014-10-10 Bernard Ycart, Konstantina Charmpi, Sophie Rousseaux, Jean-Jacques Fournié

Multi-modal fusion approaches aim to integrate information from different data sources. Unlike natural datasets, such as in audio-visual applications, where samples consist of "paired" modalities, data in healthcare is often collected…

Image and Video Processing · Electrical Eng. & Systems 2023-03-03 Nasir Hayat, Krzysztof J. Geras, Farah E. Shamout

The combination of multiple classifiers using ensemble methods is increasingly important for making progress in a variety of difficult prediction problems. We present a comparative analysis of several ensemble methods through two case…

Machine Learning · Computer Science 2013-09-20 Sean Whalen, Gaurav Pandey