Related papers: Managing Complex Structured Data In a Fast Evolvin…

200 papers

Cross-border access to a variety of data such as market information, strategic information, or customer-related information defines the daily business of many global companies, including financial institutions. These companies are obliged…

Cryptography and Security · Computer Science 2010-06-24 Michael Stieghahn, Thomas Engel

Large language models have shown unprecedented abilities in generating linguistically coherent and syntactically correct natural language output. However, they often return incorrect and inconsistent answers to input questions. Due to the…

Databases · Computer Science 2023-12-27 Jasmin Mousavi, Arash Termehchy

Traditionally, the integration of data from multiple sources is done on an ad-hoc basis for each analysis scenario and application. This is a solution that is inflexible, incurs high costs, leads to "silos" that prevent sharing data…

Large language models (LLMs) have shown impressive performance on general-purpose tasks, yet adapting them to specific domains remains challenging due to the scarcity of high-quality domain data. Existing data synthesis tools often struggle…

Computation and Language · Computer Science 2025-07-08 Ziyang Miao, Qiyu Sun, Jingyuan Wang, Yuchen Gong, Yaowei Zheng, Shiqi Li, Richong Zhang

Designing effective data manipulation methods is a long standing problem in data lakes. Traditional methods, which rely on rules or machine learning models, require extensive human efforts on training data collection and tuning models.…

Artificial Intelligence · Computer Science 2024-05-13 Yichen Qian, Yongyi He, Rong Zhu, Jintao Huang, Zhijian Ma, Haibin Wang, Yaohua Wang, Xiuyu Sun, Defu Lian, Bolin Ding, Jingren Zhou

Structured data offers a sophisticated mechanism for the organization of information. Existing methodologies for the text-serialization of structured data in the context of large language models fail to adequately address the heterogeneity…

Computation and Language · Computer Science 2024-02-20 YiQiu Guo, Yuchen Yang, Ya Zhang, Yu Wang, Yanfeng Wang

Intentional or unintentional leakage of confidential data is undoubtedly one of the most severe security threats that organizations face in the digital era. The threat now extends to our personal lives: a plethora of personal information is…

Cryptography and Security · Computer Science 2014-08-06 Michael Backes, Niklas Grimm, Aniket Kate

In today's world we are confronted with increasing amounts of information every day coming from a large variety of sources. People and co-operations are producing data on a large scale, and since the rise of the internet, e-mail and social…

Information Retrieval · Computer Science 2016-09-06 Maarten Banerveld, Nhien-An Le-Khac, Tahar Kechadi

Ensuring compliance with international data protection standards for privacy and data security is a crucial but complex task, often requiring substantial legal expertise. This paper introduces LegiLM, a novel legal language model…

Computation and Language · Computer Science 2024-09-24 Linkai Zhu, Lu Yang, Chaofan Li, Shanwen Hu, Lu Liu, Bin Yin

A long standing goal of the data management community is to develop general, automated systems that ingest semi-structured documents and output queryable tables without human effort or domain specific customization. Given the sheer variety…

Computation and Language · Computer Science 2025-03-10 Simran Arora, Brandon Yang, Sabri Eyuboglu, Avanika Narayan, Andrew Hojel, Immanuel Trummer, Christopher Ré

Nowadays, many decision support applications need to exploit data that are not only numerical or symbolic, but also multimedia, multistructure, multisource, multimodal, and/or multiversion. We term such data complex data. Managing and…

Databases · Computer Science 2007-07-12 Jérôme Darmont, Omar Boussaid, Jean-Christian Ralaivao, Kamel Aouiche

Large language models (LLMs) are increasingly applied in fields such as finance, education, and governance due to their ability to generate human-like text and adapt to specialized tasks. However, their widespread adoption raises critical…

Cryptography and Security · Computer Science 2025-05-26 Yu Wang, Cailing Cai, Zhihua Xiao, Peifung E. Lam

Data complexity is an important concept in the natural sciences and related areas, but lacks a rigorous and computable definition. In this paper, we focus on a particular sense of complexity that is high if the data is structured in a way…

Computer Vision and Pattern Recognition · Computer Science 2025-03-21 Louis Mahon

The data model of an application, the nature and format of data stored across executions, is typically a very rigid part of its early specification, even when prototyping, and changing it after code that relies on it was written can prove…

Software Engineering · Computer Science 2008-02-26 Pierre Thierry, Simon E. B. Thierry

Programming with logic for sophisticated applications must deal with recursion and negation, which together have created significant challenges in logic, leading to many different, conflicting semantics of rules. This paper describes a…

Logic in Computer Science · Computer Science 2021-10-07 Yanhong A. Liu, Scott D. Stoller

In the rapidly evolving field of legal analytics, finding relevant cases and accurately predicting judicial outcomes are challenging because of the complexity of legal language, which often includes specialized terminology, complex syntax,…

Computation and Language · Computer Science 2024-08-01 Dong Shu, Haoran Zhao, Xukun Liu, David Demeter, Mengnan Du, Yongfeng Zhang

Our goal is to build classification models using a combination of free-text and structured data. To do this, we represent structured data by text sentences, DataWords, so that similar data items are mapped into the same sentence. This…

Machine Learning · Computer Science 2022-02-18 Stephen I. Gallant, Mirza Nasir Hossain

With web and mobile platforms becoming more prominent devices utilized in data analysis, there are currently few systems that are without flaw. In order to increase the performance of these systems and decrease errors of data…

Databases · Computer Science 2022-05-03 Daniel Szelogowski

Strings are ubiquitous in code. Not all strings are created equal; some contain structure that makes them incompatible with other strings. CSS units are an obvious example. Worse, type checkers cannot see this structure: this is the latent…

Programming Languages · Computer Science 2019-04-26 David Kelly, Mark Marron, David Clark, Earl T. Barr

Although large language models (LLMs) have advanced the state-of-the-art in NLP significantly, deploying them for downstream applications is still challenging due to cost, responsiveness, control, or concerns around privacy and security. As…

Computation and Language · Computer Science 2023-11-01 Dong-Ho Lee, Jay Pujara, Mohit Sewak, Ryen W. White, Sujay Kumar Jauhar