Related papers: Towards Linear Algebra over Normalized Data

200 papers

Machine learning algorithms use error function minimization to fit a large set of parameters in a preexisting model. However, error minimization eventually leads to a memorization of the training dataset, losing the ability to generalize to…

Machine Learning · Computer Science 2018-03-16 Fernando Martin-Maroto, Gonzalo G. de Polavieja

Statistics and Optimization are foundational to modern Machine Learning. Here, we propose an alternative foundation based on Abstract Algebra, with mathematics that facilitates the analysis of learning. In this approach, the goal of the…

Machine Learning · Computer Science 2025-02-28 Fernando Martin-Maroto, Nabil Abderrahaman, David Mendez, Gonzalo G. de Polavieja

Machine Learning (ML) applications are proliferating in the enterprise. Relational data which are prevalent in enterprise applications are typically normalized; as a result, data has to be denormalized via primary/foreign-key joins to be…

Machine Learning · Computer Science 2021-03-22 Zhaoyue Chen, Nick Koudas, Zhe Zhang, Xiaohui Yu

Large language models (LLMs) have shown promise in table Question Answering (Table QA). However, extending these capabilities to multi-table QA remains challenging due to unreliable schema linking across complex tables. Existing methods…

Artificial Intelligence · Computer Science 2025-11-25 Xixi Wang, Miguel Costa, Jordanka Kovaceva, Shuai Wang, Francisco C. Pereira

Data processing systems roughly group into families such as relational, array, graph, and key-value. Many data processing tasks exceed the capabilities of any one family, require data stored across families, or run faster when partitioned…

Databases · Computer Science 2016-04-14 Dylan Hutchison, Bill Howe, Dan Suciu

Without any doubt, the relational paradigm has been a huge success. At the same time, we believe that the time is ripe to rethink how database systems could look like if we designed them from scratch. Would we really end up with the same…

Databases · Computer Science 2025-04-18 Jens Dittrich

We present module theory and linear maps as a powerful generalised and computationally efficient framework for the relational data model, which underpins today's relational database systems. Based on universal constructions of modules we…

Programming Languages · Computer Science 2022-07-05 Fritz Henglein, Robin Kaarsgaard, Mikkel Kragh Mathiesen

Machine learning has a long collaborative tradition with several fields of mathematics, such as statistics, probability and linear algebra. We propose a new direction for machine learning research: $C^*$-algebraic ML $-$ a…

Machine Learning · Computer Science 2024-06-10 Yuka Hashimoto, Masahiro Ikeda, Hachem Kadri

Randomized linear algebra (RLA) algorithms are a modern class of numerical linear algebra techniques that play an essential role in scientific computing and machine learning, with broad and growing adoption. However, their discovery remains…

Machine Learning · Computer Science 2026-05-19 Jinglong Xiong, Xiaotian Liu, Ruoxin Wang, Zihang Liu, Yefan Zhou, Yujun Yan, Yaoqing Yang

Linear algebraic expressions are the essence of many computationally intensive problems, including scientific simulations and machine learning applications. However, translating high-level formulations of these expressions to efficient…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-03-22 Dániel Berényi, András Leitereg, Gábor Lehel

Machine learning can provide deep insights into data, allowing machines to make high-quality predictions and having been widely used in real-world applications, such as text mining, visual classification, and recommender systems. However,…

Machine Learning · Computer Science 2020-08-11 Meng Wang, Weijie Fu, Xiangnan He, Shijie Hao, Xindong Wu

Large language models (LLMs) are increasingly used to automate feature engineering in tabular learning. Given task-specific information, LLMs can propose diverse feature transformation operations to enhance downstream model performance.…

Machine Learning · Computer Science 2026-01-30 Zhuoyan Li, Aditya Bansal, Jinzhao Li, Shishuang He, Zhuoran Lu, Mutian Zhang, Qin Liu, Yiwei Yang, Swati Jain, Ming Yin, Yunyao Li

The machine learning (ML) training over disparate data sources traditionally involves materialization, which can impose substantial time and space overhead due to data movement and replication. Factorized learning, which leverages direct…

Machine Learning · Computer Science 2025-02-05 Wenbo Sun, Rihan Hai

This paper uses typed linear algebra (LA) to represent data and perform analytical querying in a single, unified framework. The typed approach offers strong type checking (as in modern programming languages) and a diagrammatic way of…

We propose a new formalism for specifying and reasoning about problems that involve heterogeneous "pieces of information" -- large collections of data, decision procedures of any kind and complexity and connections between them. The essence…

Logic in Computer Science · Computer Science 2016-12-30 Eugenia Ternovska

In recent years, Large Language Models (LLMs) have demonstrated remarkable capabilities in parsing textual data and generating code. However, their performance in tasks involving tabular data, especially those requiring symbolic reasoning,…

Computation and Language · Computer Science 2025-04-04 Md Mahadi Hasan Nahid , Davood Rafiei

This tutorial overviews the state of the art in learning models over relational databases and makes the case for a first-principles approach that exploits recent developments in database research. The input to learning classification and…

Databases · Computer Science 2019-11-18 Maximilian Schleich, Dan Olteanu, Mahmoud Abo-Khamis, Hung Q. Ngo, XuanLong Nguyen

We consider a novel backward-compatible paradigm of general data analytics over a recently-reported semisimple algebra (called t-algebra). We study the abstract algebraic framework over the t-algebra by representing the elements of…

Computer Vision and Pattern Recognition · Computer Science 2021-05-04 Liang Liao, Stephen John Maybank

What is a systematic way to efficiently apply a wide spectrum of advanced ML programs to industrial scale problems, using Big Models (up to 100s of billions of parameters) on Big Data (up to terabytes or petabytes)? Modern parallelization…

In modern data analytics, analysts frequently face the challenge of searching for desirable entities by evaluating, for each entity, a collection of its feature relations to derive key analytical properties. This search is challenging…

Databases · Computer Science 2025-07-25 Xi Wu, Eugene Wu, Zichen Zhu, Fengan Li, Jeffrey F. Naughton