Related papers: Efficient and Compact Spreadsheet Formula Graphs

End-to-End Compression for Tabular Foundation Models

The long-standing dominance of gradient-boosted decision trees for tabular data has recently been challenged by in-context learning tabular foundation models. In-context learning methods fit and predict in one forward pass without parameter…

Machine Learning · Computer Science 2026-02-06 Guri Zabërgja, Rafiq Kamel, Arlind Kadra, Christian M. M. Frey, Josif Grabocka

Auto-Formula: Recommend Formulas in Spreadsheets using Contrastive Learning for Table Representations

Spreadsheets are widely recognized as the most popular end-user programming tools, which blend the power of formula-based computation, with an intuitive table-based interface. Today, spreadsheets are used by billions of users to manipulate…

Databases · Computer Science 2024-04-22 Sibei Chen, Yeye He, Weiwei Cui, Ju Fan, Song Ge, Haidong Zhang, Dongmei Zhang, Surajit Chaudhuri

Sheet as Token: A Graph-Enhanced Representation for Multi-Sheet Spreadsheet Understanding

Workbook-scale spreadsheet understanding is increasingly important for language-model-based data analysis agents, but remains challenging because relevant information is often distributed across multiple sheets with heterogeneous schemas,…

Artificial Intelligence · Computer Science 2026-05-08 Yiming Lei, Yiqi Wang, Yujia Zhang, Bo Guan, Depei Zhu, Chunhui Wang, Zhuonan Hao, Tianyu Shi

Towards an Efficient Discovery of the Topological Representative Subgraphs

With the emergence of graph databases, the task of frequent subgraph discovery has been extensively addressed. Although the proposed approaches in the literature have made this task feasible, the number of discovered frequent subgraphs is…

Databases · Computer Science 2013-08-16 Wajdi Dhifli, Mohamed Moussaoui, Rabie Saidi, Engelbert Mephu Nguifo

Balanced Co-Clustering of Users and Items for Embedding Table Compression in Recommender Systems

Recommender systems have advanced markedly over the past decade by transforming each user/item into a dense embedding vector with deep learning models. At industrial scale, embedding tables constituted by such vectors of all users/items…

Information Retrieval · Computer Science 2026-04-21 Runhao Jiang, Renchi Yang, Donghao Wu

A Self-Evolving Framework for Efficient Terminal Agents via Observational Context Compression

As terminal agents scale to long-horizon, multi-turn workflows, a key bottleneck is not merely limited context length, but the accumulation of noisy terminal observations in the interaction history. Retaining raw observations preserves…

Computation and Language · Computer Science 2026-05-18 Jincheng Ren, Siwei Wu, Yizhi Li, Kang Zhu, Shu Xu, Boyu Feng, Ruibin Yuan, Wei Zhang, Riza Batista-Navarro, Jian Yang, Chenghua Lin

Tabularis Formatus: Predictive Formatting for Tables

Spreadsheet manipulation software is widely used for data management and analysis of tabular data, yet the creation of conditional formatting (CF) rules remains a complex task requiring technical knowledge and experience with specific…

Databases · Computer Science 2025-08-18 Mukul Singh, José Cambronero, Sumit Gulwani, Vu Le, Gust Verbruggen

TRAKO: Efficient Transmission of Tractography Data for Visualization

Fiber tracking produces large tractography datasets that are tens of gigabytes in size consisting of millions of streamlines. Such vast amounts of data require formats that allow for efficient storage, transfer, and visualization. We…

Image and Video Processing · Electrical Eng. & Systems 2020-04-29 Daniel Haehn, Loraine Franke, Fan Zhang, Suheyla Cetin Karayumak, Steve Pieper, Lauren O'Donnell, Yogesh Rathi

Querying Spreadsheets: An Empirical Study

One of the most important assets of any company is being able to easily access information on itself and on its business. In this line, it has been observed that this important information is often stored in one of the millions of…

Software Engineering · Computer Science 2015-03-02 Jácome Cunha, João Paulo Fernandes, Rui Pereira, João Saraiva

TACO: Efficient Communication Compression of Intermediate Tensors for Scalable Tensor-Parallel LLM Training

Handling communication overhead in large-scale tensor-parallel training remains a critical challenge due to the dense, near-zero distributions of intermediate tensors, which exacerbate errors under frequent communication and introduce…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-04-28 Man Liu, Xingchen Liu, Xingjian Tian, Bing Lu, Shengkay Lyu, Shengquan Yin, Wenjing Huang, Zheng Wei, Hairui Zhao, Guangming Tan, Dingwen Tao

GraphScope Flex: LEGO-like Graph Computing Stack

Graph computing has become increasingly crucial in processing large-scale graph data, with numerous systems developed for this purpose. Two years ago, we introduced GraphScope as a system addressing a wide array of graph computing needs,…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-12-20 Tao He, Shuxian Hu, Longbin Lai, Dongze Li, Neng Li, Xue Li, Lexiao Liu, Xiaojian Luo, Binqing Lyu, Ke Meng, Sijie Shen, Li Su, Lei Wang, Jingbo Xu, Wenyuan Yu, Weibin Zeng, Lei Zhang, Siyuan Zhang, Jingren Zhou, Xiaoli Zhou, Diwen Zhu

E^2GraphRAG: Streamlining Graph-based RAG for High Efficiency and Effectiveness

Graph-based RAG methods like GraphRAG have shown promising global understanding of the knowledge base by constructing hierarchical entity graphs. However, they often suffer from inefficiency and rely on manually pre-defined query modes,…

Artificial Intelligence · Computer Science 2025-06-09 Yibo Zhao, Jiapeng Zhu, Ye Guo, Kangkang He, Xiang Li

Spreadsheet computing with Finite Domain Constraint Enhancements

Spreadsheet computing is one of the more popular computing methodologies in today's modern society. The spreadsheet application's ease of use and usefulness has enabled non-programmers to perform programming-like tasks in a familiar setting…

Artificial Intelligence · Computer Science 2022-03-22 Ezana N. Beyenne

Enhanced Spreadsheet Computing with Finite-Domain Constraint Satisfaction

The spreadsheet application is among the most widely used computing tools in modern society. It provides excellent usability and usefulness, and it easily enables a non-programmer to perform programming-like tasks in a visual tabular "pen…

Programming Languages · Computer Science 2022-03-31 Ezana N. Beyenne, Hai-Feng Guo

TAPER: query-aware, partition-enhancement for large, heterogenous, graphs

Graph partitioning has long been seen as a viable approach to address Graph DBMS scalability. A partitioning, however, may introduce extra query processing latency unless it is sensitive to a specific query workload, and optimised to…

Databases · Computer Science 2016-06-24 Hugo Firth, Paolo Missier

Survey and Taxonomy of Lossless Graph Compression and Space-Efficient Graph Representations

Various graphs such as web or social networks may contain up to trillions of edges. Compressing such datasets can accelerate graph processing by reducing the amount of I/O accesses and the pressure on the memory subsystem. Yet, selecting a…

Data Structures and Algorithms · Computer Science 2019-04-30 Maciej Besta, Torsten Hoefler

TableTalk: Scaffolding Spreadsheet Development with a Language Agent

Spreadsheet programming is challenging. Programmers use spreadsheet programming knowledge (e.g., formulas) and problem-solving skills to combine actions into complex tasks. Advancements in large language models have introduced language…

Software Engineering · Computer Science 2025-08-27 Jenny T. Liang, Aayush Kumar, Yasharth Bajpai, Sumit Gulwani, Vu Le, Chris Parnin, Arjun Radhakrishna, Ashish Tiwari, Emerson Murphy-Hill, Gustavo Soares

Compressing Structured Tensor Algebra

Tensor algebra is a crucial component for data-intensive workloads such as machine learning and scientific computing. As the complexity of data grows, scientists often encounter a dilemma between the highly specialized dense tensor algebra…

Programming Languages · Computer Science 2024-07-19 Mahdi Ghorbani, Emilien Bauer, Tobias Grosser, Amir Shaikhha

An Information-theoretic Framework for the Lossy Compression of Link Streams

Graph compression is a data analysis technique that consists in the replacement of parts of a graph by more general structural patterns in order to reduce its description length. It notably provides interesting exploration tools for the…

Data Structures and Algorithms · Computer Science 2018-07-19 Robin Lamarche-Perrin

PASCO (PArallel Structured COarsening): an overlay to speed up graph clustering algorithms

Clustering the nodes of a graph is a cornerstone of graph analysis and has been extensively studied. However, some popular methods are not suitable for very large graphs: e.g., spectral clustering requires the computation of the spectral…

Machine Learning · Computer Science 2025-06-13 Etienne Lasalle, Rémi Vaudaine, Titouan Vayer, Pierre Borgnat, Rémi Gribonval, Paulo Gonçalves, Màrton Karsai