Related papers: Fusing heterogeneous data sets

200 papers

In this paper, we present an information-theoretic method for clustering mixed-type data, that is, data consisting of both continuous and categorical variables. The proposed approach extends the Information Bottleneck principle to…

Methodology · Statistics 2026-02-02 Efthymios Costa , Ioanna Papatsouma , Angelos Markos

Cellular heterogeneity is an immanent property of biological systems that covers very different aspects of life ranging from genetic diversity to cell-to-cell variability driven by stochastic molecular interactions, and noise induced cell…

Quantitative Methods · Quantitative Biology 2021-09-15 Niko Komin , Alexander Skupin

The rapid advancement of high-throughput sequencing and other assay technologies has resulted in the generation of large and complex multi-omics datasets, offering unprecedented opportunities for advancing precision medicine strategies.…

Quantitative Methods · Quantitative Biology 2025-01-30 Ana R. Baião , Zhaoxiang Cai , Rebecca C Poulos , Phillip J. Robinson , Roger R Reddel , Qing Zhong , Susana Vinga , Emanuel Gonçalves

This article proposes a powerful scheme to monitor a large number of categorical data streams with heterogeneous parameters or nature. The data streams considered may be either nominal with a number of attribute levels or ordinal with some…

Methodology · Statistics 2021-12-17 Kaizong Bai , Jian Li

The scarcity of well-annotated medical datasets requires leveraging transfer learning from broader datasets like ImageNet or pre-trained models like CLIP. Model soups average multiple fine-tuned models, aiming to improve performance on…

Computer Vision and Pattern Recognition · Computer Science 2024-06-04 Santosh Sanjeev , Nuren Zhaksylyk , Ibrahim Almakky , Anees Ur Rehman Hashmi , Mohammad Areeb Qazi , Mohammad Yaqub

Quantitative evidence synthesis methods aim to combine data from multiple medical trials to infer relative effects of different interventions. A challenge arises when trials report continuous outcomes on different measurement scales. To…

Methodology · Statistics 2024-02-29 Annabel L Davies , A E Ades , Julian PT Higgins

Data comes in many forms. From a shallow perspective, they can be viewed as being either in structured (e.g., as a relation, as key-value pairs) or unstructured (e.g., text, image) formats. So far, machines have been fairly good at…

Computation and Language · Computer Science 2026-03-31 Md Ataur Rahman , Dimitris Sacharidis , Oscar Romero , Sergi Nadal

One of the major research questions regarding human microbiome studies is the feasibility of designing interventions that modulate the composition of the microbiome to promote health and cure disease. This requires extensive understanding…

Methodology · Statistics 2021-11-18 Matthew D. Koslovsky , Kristi L. Hoffman , Carrie R. Daniel , Marina Vannucci

Multiple sets of measurements on the same objects obtained from different platforms may reflect partially complementary information of the studied system. The integrative analysis of such data sets not only provides us with the opportunity…

Methodology · Statistics 2020-10-15 Yipeng Song , Johan A. Westerhuis , Age K. Smilde

Entity information network is used to describe structural relationships between entities. Taking advantage of its extension and heterogeneity, entity information network is more and more widely applied to relationship modeling. Recent…

Information Retrieval · Computer Science 2017-10-11 Liang Yin , Li-Chen Shi , Jun-Yan Zhao , Song-Yang Du , Wen-Bo Xie , Duan-Bing Chen

In recent years, machine learning has demonstrated impressive capability in handling molecular science tasks. To support various molecular properties at scale, machine learning models are trained in the multi-task learning paradigm.…

Machine Learning · Computer Science 2024-10-15 Yuxuan Ren , Dihan Zheng , Chang Liu , Peiran Jin , Yu Shi , Lin Huang , Jiyan He , Shengjie Luo , Tao Qin , Tie-Yan Liu

Nowadays, journalism is facilitated by the existence of large amounts of digital data sources, including many Open Data ones. Such data sources are extremely heterogeneous, ranging from highly structured (relational databases),…

This paper proposes a new approach to multi-sensor data fusion. It suggests that aggregation of data from multiple sensors can be done more efficiently when we consider information about sensors' different characteristics. Similar to most…

Systems and Control · Electrical Eng. & Systems 2019-09-10 Mohammad Amin Ahmad Akhoundi , Ehsan Valavi

Many data sets contain an inherent multilevel structure, for example, because of repeated measurements of the same observational units. Taking this structure into account is critical for the accuracy and calibration of any statistical…

Methodology · Statistics 2020-05-07 Topi Paananen , Alejandro Catalina , Paul-Christian Bürkner , Aki Vehtari

Multimodal fusion focuses on integrating information from multiple modalities with the goal of more accurate prediction, which has achieved remarkable progress in a wide range of scenarios, including autonomous driving and medical…

Machine Learning · Computer Science 2024-11-04 Qingyang Zhang , Yake Wei , Zongbo Han , Huazhu Fu , Xi Peng , Cheng Deng , Qinghua Hu , Cai Xu , Jie Wen , Di Hu , Changqing Zhang

Suppose that we are interested in the comparison of two independent categorical variables. Suppose also that the population is divided into subpopulations or groups. Notice that the distribution of the target variable may vary across…

Methodology · Statistics 2024-05-08 M. V. Alba-Fernández , M. D. Jiménez-Gamero , F. J. Ariza-López

Recently developed technologies to generate single-cell genomic data have made a revolutionary impact in the field of biology. Multi-omics assays offer even greater opportunities to understand cellular states and biological processes.…

Genomics · Quantitative Biology 2022-01-19 Stefan Stanojevic , Yijun Li , Lana X. Garmire

We study stochastic particle systems made up of heterogeneous units. We introduce a general framework suitable for analytically studying this kind of system and apply it to two particular models of interest in economy and epidemiology. We show…

Soft Condensed Matter · Physics 2013-02-06 Luis F. Lafuerza , Raul Toral

Modern TEM instrumentation can probe a wide range of structural, optical, and chemical properties with unprecedented resolution. However, each of these properties must be recorded in independent datasets using different detector modes with…

Materials Science · Physics 2023-01-16 Thomas Thersleff , Cheuk-Wai Tai

Genetic data are frequently categorical and have complex dependence structures that are not always well understood. For this reason, clustering and classification based on genetic data, while highly relevant, are challenging statistical…

Methodology · Statistics 2016-06-13 Gabriela Bettella Cybis , Marcio Valk , Silvia Regina Costa Lopes