Related papers: Data Shapes and Data Transformations
Data comes in many forms. From a shallow perspective, it can be viewed as being in either structured (e.g., as a relation, as key-value pairs) or unstructured (e.g., text, image) formats. So far, machines have been fairly good at…
Data-driven methods play an increasingly important role in discovering geometric, structural, and semantic relationships between 3D shapes in collections, and applying this analysis to support intelligent modeling, editing, and…
Data heterogeneity is a prevalent issue, stemming from various conflicting factors, making its utilization complex. This uncertainty, particularly resulting from disparities in data formats, frequently necessitates the involvement of…
Tabular data, comprising rows (samples) with the same set of columns (attributes), is one of the most widely used data types across industries, including financial services, health care, research, retail, and logistics, to name a few.…
The performance of machine learning models relies heavily on the quality of input data, yet real-world applications often face significant data-related challenges. A common issue arises when curating training data or deploying models: two…
Nowadays, journalism is facilitated by the existence of large amounts of digital data sources, including many Open Data ones. Such data sources are extremely heterogeneous, ranging from highly structured (relational databases),…
Schema and data integration have been a challenge for more than 40 years. While data warehouse technologies are quite a success story, there is still a lack of information integration methods, especially if the data sources are based on…
In a data warehousing process, mastering the data preparation phase allows substantial gains in terms of time and performance when performing multidimensional analysis or using data mining algorithms. Furthermore, a data warehouse can…
One of the purposes of Big Data systems is to support analysis of data gathered from heterogeneous data sources. Since data warehouses have been used for several decades to achieve the same goal, they could be leveraged also to provide…
Heterogeneous tabular data are the most commonly used form of data and are essential for numerous critical and computationally demanding applications. On homogeneous data sets, deep neural networks have repeatedly shown excellent…
Spreadsheet tables are often labeled, and these labels effectively constitute types for the data in the table. In such cases tables can be considered to be built from typed data where the placement of values within the table is controlled…
The data warehouse (DW) technology was developed to integrate heterogeneous information sources for analysis purposes. Information sources are more and more autonomous and they often change their content due to perpetual transactions (data…
The article deals with the problem which led to Big Data. Big Data information technology is the set of methods and means of processing different types of structured and unstructured dynamic large amounts of data for their analysis and use…
Most real systems consist of a large number of interacting, multi-typed components, while most contemporary research models them as homogeneous networks, without distinguishing different types of objects and links in the networks.…
Tables have gained significant attention in large language models (LLMs) and multimodal large language models (MLLMs) due to their complex and flexible structure. Unlike linear text inputs, tables are two-dimensional, encompassing formats…
The web of data has brought forth the need to preserve and sustain evolving information within linked datasets; however, a basic requirement of data preservation is the maintenance of the datasets' structural characteristics as well. As…
Tabular data is one of the most widely used formats across industries, driving critical applications in areas such as finance, healthcare, and marketing. In the era of data-centric AI, improving data quality and representation has become…
Database migration is a key task in software modernization, increasingly involving transformations across heterogeneous data models such as relational and NoSQL systems. Existing approaches are typically designed for specific source-target…
With systems for acquiring 3D surface data being ever more commonplace, it has become important to reliably extract specific shapes from the acquired data. In the presence of noise and occlusions, this can be done through the use of…
A traditional database system is organized around a single data model that determines how data can be organized, stored and manipulated. But the vision of this paper is to develop new principles and techniques to manage multiple data…