Related papers: On Linearizing Structured Data in Encoder-Decoder …

Let Your Graph Do the Talking: Encoding Structured Data for LLMs

How can we best encode structured data into sequential form for use in large language models (LLMs)? In this work, we introduce a parameter-efficient method to explicitly represent structured data for LLMs. Our method, GraphToken, learns an…

Machine Learning · Computer Science 2024-02-09 Bryan Perozzi , Bahare Fatemi , Dustin Zelle , Anton Tsitsulin , Mehran Kazemi , Rami Al-Rfou , Jonathan Halcrow

Innovative tokenisation of structured data for LLM training

Data representation remains a fundamental challenge in machine learning, particularly when adapting sequence-based architectures like Transformers and Large Language Models (LLMs) for structured tabular data. Existing methods often fail to…

Machine Learning · Computer Science 2025-08-05 Kayvan Karim , Hani Ragab Hassen , Hadj Batatia

Graph Linearization Methods for Reasoning on Graphs with Large Language Models

Large language models have evolved to process multiple modalities beyond text, such as images and audio, which motivates us to explore how to effectively leverage them for graph reasoning tasks. The key question, therefore, is how to…

Computation and Language · Computer Science 2025-06-26 Christos Xypolopoulos , Guokan Shang , Xiao Fei , Giannis Nikolentzos , Hadi Abdine , Iakovos Evdaimon , Michail Chatzianastasis , Giorgos Stamou , Michalis Vazirgiannis

Representation Learning of Structured Data for Medical Foundation Models

Large Language Models (LLMs) have demonstrated remarkable performance across various domains, including healthcare. However, their ability to effectively represent structured non-textual data, such as the alphanumeric medical codes used in…

Computation and Language · Computer Science 2024-10-18 Vijay Prakash Dwivedi , Viktor Schlegel , Andy T. Liu , Thanh-Tung Nguyen , Abhinav Ramesh Kashyap , Jeng Wei , Wei-Hsian Yin , Stefan Winkler , Robby T. Tan

Struc-EMB: The Potential of Structure-Aware Encoding in Language Embeddings

Text embeddings from Large Language Models (LLMs) have become foundational for numerous applications. However, these models typically operate on raw text, overlooking the rich structural information, such as hyperlinks or citations, that…

Machine Learning · Computer Science 2025-10-13 Shikun Liu , Haoyu Wang , Mufei Li , Pan Li

When Structure Doesn't Help: LLMs Do Not Read Text-Attributed Graphs as Effectively as We Expected

Graphs provide a unified representation of semantic content and relational structure, making them a natural fit for domains such as molecular modeling, citation networks, and social graphs. Meanwhile, large language models (LLMs) have…

Machine Learning · Computer Science 2026-05-04 Haotian Xu , Yuning You , Tengfei Ma

Large Language Models are Good Relational Learners

Large language models (LLMs) have demonstrated remarkable capabilities across various domains, yet their application to relational deep learning (RDL) remains underexplored. Existing approaches adapt LLMs by traversing relational links…

Computation and Language · Computer Science 2025-06-09 Fang Wu , Vijay Prakash Dwivedi , Jure Leskovec

Linearity of Relation Decoding in Transformer Language Models

Much of the knowledge encoded in transformer language models (LMs) may be expressed in terms of relations: relations between words and their synonyms, entities and their attributes, etc. We show that, for a subset of relations, this…

Computation and Language · Computer Science 2024-02-19 Evan Hernandez , Arnab Sen Sharma , Tal Haklay , Kevin Meng , Martin Wattenberg , Jacob Andreas , Yonatan Belinkov , David Bau

Colorful Talks with Graphs: Human-Interpretable Graph Encodings for Large Language Models

Graph problems are fundamentally challenging for large language models (LLMs). While LLMs excel at processing unstructured text, graph tasks require reasoning over explicit structure, permutation invariance, and computationally complex…

Machine Learning · Computer Science 2026-04-23 Angelo Zangari , Peyman Baghershahi , Sourav Medya

A Hierarchical Model for Data-to-Text Generation

Transcribing structured data into natural language descriptions has emerged as a challenging task, referred to as "data-to-text". These structures generally regroup multiple elements, as well as their attributes. Most attempts rely on…

Computation and Language · Computer Science 2019-12-23 Clément Rebuffel , Laure Soulier , Geoffrey Scoutheeten , Patrick Gallinari

Embeddings and Representation Learning for Structured Data

Performing machine learning on structured data is complicated by the fact that such data does not have vectorial form. Therefore, multiple approaches have emerged to construct vectorial representations of structured data, from kernel and…

Machine Learning · Computer Science 2019-05-16 Benjamin Paaßen , Claudio Gallicchio , Alessio Micheli , Alessandro Sperduti

LLaSA: Large Language and Structured Data Assistant

Structured data, such as tables, graphs, and databases, play a critical role in plentiful NLP tasks such as question answering and dialogue systems. Recently, inspired by Vision-Language Models, Graph Neural Networks (GNNs) have been…

Computation and Language · Computer Science 2025-02-11 Yao Xu , Shizhu He , Jiabei Chen , Zeng Xiangrong , Bingning Wang , Guang Liu , Jun Zhao , Kang Liu

Learning and analyzing vector encoding of symbolic representations

We present a formal language with expressions denoting general symbol structures and queries which access information in those structures. A sequence-to-sequence network processing this language learns to encode symbol structures and query…

Artificial Intelligence · Computer Science 2018-03-13 Roland Fernandez , Asli Celikyilmaz , Rishabh Singh , Paul Smolensky

Table Meets LLM: Can Large Language Models Understand Structured Table Data? A Benchmark and Empirical Study

Large language models (LLMs) are becoming attractive as few-shot reasoners to solve Natural Language (NL)-related tasks. However, the understanding of their capability to process structured data like tables remains an under-explored area.…

Computation and Language · Computer Science 2024-07-18 Yuan Sui , Mengyu Zhou , Mingjie Zhou , Shi Han , Dongmei Zhang

From Anchors to Answers: A Novel Node Tokenizer for Integrating Graph Structure into Large Language Models

Enabling large language models (LLMs) to effectively process and reason with graph-structured data remains a significant challenge despite their remarkable success in natural language tasks. Current approaches either convert graph…

Artificial Intelligence · Computer Science 2025-09-03 Yanbiao Ji , Chang Liu , Xin Chen , Dan Luo , Mei Li , Yue Ding , Wenqing Lin , Hongtao Lu

Automated Data Visualization from Natural Language via Large Language Models: An Exploratory Study

The Natural Language to Visualization (NL2Vis) task aims to transform natural-language descriptions into visual representations for a grounded table, enabling users to gain insights from vast amounts of data. Recently, many deep…

Databases · Computer Science 2024-04-29 Yang Wu , Yao Wan , Hongyu Zhang , Yulei Sui , Wucai Wei , Wei Zhao , Guandong Xu , Hai Jin

Nodes Are Early, Edges Are Late: Probing Diagram Representations in Large Vision-Language Models

Large vision-language models (LVLMs) demonstrate strong performance on diagram understanding benchmarks, yet they still struggle with understanding relationships between elements, particularly those represented by nodes and directed edges…

Computation and Language · Computer Science 2026-03-04 Haruto Yoshida , Keito Kudo , Yoichi Aoki , Ryota Tanaka , Itsumi Saito , Keisuke Sakaguchi , Kentaro Inui

Scalable Representation Learning for Multimodal Tabular Transactions

Large language models (LLMs) are primarily designed to understand unstructured text. When directly applied to structured formats such as tabular data, they may struggle to discern inherent relationships and overlook critical patterns. While…

Machine Learning · Computer Science 2024-10-11 Natraj Raman , Sumitra Ganesh , Manuela Veloso

Plugging Schema Graph into Multi-Table QA: A Human-Guided Framework for Reducing LLM Reliance

Large language models (LLMs) have shown promise in table Question Answering (Table QA). However, extending these capabilities to multi-table QA remains challenging due to unreliable schema linking across complex tables. Existing methods…

Artificial Intelligence · Computer Science 2025-11-25 Xixi Wang , Miguel Costa , Jordanka Kovaceva , Shuai Wang , Francisco C. Pereira

Multi-View Empowered Structural Graph Wordification for Language Models

Significant efforts have been dedicated to integrating the powerful Large Language Models (LLMs) with diverse modalities, particularly focusing on the fusion of language, vision and audio data. However, the graph-structured data, which is…

Computation and Language · Computer Science 2024-12-31 Zipeng Liu , Likang Wu , Ming He , Zhong Guan , Hongke Zhao , Nan Feng