Related papers: Schema Integration on Massive Data Sources

200 papers

In the Big Data era, information integration often requires abundant data extracted from massive data sources. Because the number of data sources is large, data source selection plays a crucial role in information integration, since it is costly and…

Databases · Computer Science 2016-11-01 Yiming Lin, Hongzhi Wang, Jianzhong Li, Hong Gao

An applied problem facing all areas of data science is harmonizing data sources. Joining data from multiple origins with unmapped and only partially overlapping features is a prerequisite to developing and testing robust, generalizable…

Data integration is one of the main problems in distributed data sources. An approach is to provide an integrated mediated schema for various data sources. This research work aims at developing a framework for defining an integrated schema…

Databases · Computer Science 2012-11-28 Amineh Amini, Hadi Saboohi, Nasser Nemat bakhsh

Data is king in the age of AI. However, data integration is often a laborious task that is hard to automate. Schema change is one significant obstacle to the automation of the end-to-end data integration process. Although there exist…

Databases · Computer Science 2020-10-16 Zijie Wang, Lixi Zhou, Amitabh Das, Valay Dave, Zhanpeng Jin, Jia Zou

Schema and data integration have been a challenge for more than 40 years. While data warehouse technologies are quite a success story, there is still a lack of information integration methods, especially if the data sources are based on…

Databases · Computer Science 2021-07-21 Fritz Laux, Malcolm Crowe

Schema matching is a method of finding attributes that are either linguistically similar to each other or represent the same information. In this project, we take a hybrid approach to solving this problem by making use of both the provided…

Databases · Computer Science 2020-04-22 Tanvi Sahay, Ankita Mehta, Shruti Jadon

Relational data sources are still one of the most popular ways to store enterprise or Web data; however, the issue with relational schemas is the lack of a well-defined semantic description. A common ontology provides a way to represent the…

Machine Learning · Computer Science 2018-01-31 Natalia Ruemmele, Yuriy Tyshetskiy, Alex Collins

Big Data architectures make it possible to flexibly store and process heterogeneous data from multiple sources in their original format. The structure of those data, commonly supplied by means of REST APIs, is continuously evolving. Thus data…

Databases · Computer Science 2018-02-05 Sergi Nadal, Oscar Romero, Alberto Abelló, Panos Vassiliadis, Stijn Vansummeren

Schema matching is a crucial task in data integration, involving the alignment of a source schema with a target schema to establish correspondence between their elements. This task is challenging due to textual and semantic heterogeneity,…

Databases · Computer Science 2024-05-31 Eitam Sheetrit, Menachem Brief, Moshik Mishaeli, Oren Elisha

The amount of data in the world is expanding rapidly. Every day, huge amounts of data are created by scientific experiments, companies, and end users' activities. These large data sets have been labeled as "Big Data", and their storage,…

Databases · Computer Science 2020-04-29 Mahdi Bohlouli, Frank Schulz, Lefteris Angelis, David Pahor, Ivona Brandic, David Atlan, Rosemary Tate

The growing need to integrate information from a large number of diverse sources poses significant scalability challenges for data integration systems. These systems often rely on manually written schema mappings, which are complex,…

Databases · Computer Science 2025-06-02 Christopher Buss, Mahdis Safari, Arash Termehchy, Stefan Lee, David Maier

Since data is often stored in different sources, it needs to be integrated to provide the global view required to create value and derive knowledge from it. A critical step in data integration is schema matching, which aims to…

Databases · Computer Science 2022-03-10 Benjamin Hättasch, Michael Truong-Ngoc, Andreas Schmidt, Carsten Binnig

Schema evolution is critical in managing database systems to ensure compatibility across different data versions. A schema registry typically addresses the challenges of schema evolution in real-time data streaming by managing, validating,…

Databases · Computer Science 2024-06-18 Silvery D. Fu, Xuewei Chen

Using data warehouses to analyse multidimensional data is a significant task in company decision-making. The data warehouse merging process is composed of two steps: matching multidimensional components and then merging them. Current…

Databases · Computer Science 2021-07-27 Yuzhao Yang, Jérôme Darmont, Franck Ravat, Olivier Teste

Bioinformatics research is characterized by voluminous and incremental datasets and complex data analytics methods. The machine learning methods used in bioinformatics are iterative and parallel. These methods can be scaled to handle big…

Computational Engineering, Finance, and Science · Computer Science 2015-06-17 Hirak Kashyap, Hasin Afzal Ahmed, Nazrul Hoque, Swarup Roy, Dhruba Kumar Bhattacharyya

The ability to collect and analyze large amounts of data is a growing problem within the scientific community. The growing gap between data and users calls for innovative tools that address the challenges faced by big data volume, velocity…

Databases · Computer Science 2016-08-01 Vijay Gadepally, Jeremy Kepner

Data collection is a major bottleneck in machine learning and an active research topic in multiple communities. There are largely two reasons data collection has recently become a critical issue. First, as machine learning is becoming more…

Machine Learning · Computer Science 2019-08-13 Yuji Roh, Geon Heo, Steven Euijong Whang

Linked Data has emerged as a successful publication format, and one of its main strengths is its fitness for integration of data from multiple sources. This gives it great potential both for semantic applications and the enterprise…

Databases · Computer Science 2014-10-30 Jan Michelfeit, Tomáš Knap, Martin Nečaský

In practical data integration systems, it is common for the data sources being integrated to provide conflicting information about the same entity. Consequently, a major challenge for data integration is to derive the most complete and…

Databases · Computer Science 2012-03-05 Bo Zhao, Benjamin I. P. Rubinstein, Jim Gemmell, Jiawei Han

One of the challenging problems in multidatabase systems is to find the most viable solution to the problem of interoperability of distributed heterogeneous autonomous local component databases. This has resulted in the creation of a…

Databases · Computer Science 2009-12-04 Mohammad Ghulam Ali