Related papers: SampleLLM: Optimizing Tabular Data Synthesis in Re…

Enhancing Table Representations with LLM-powered Synthetic Data Generation

In the era of data-driven decision-making, accurate table-level representations and efficient table recommendation systems are becoming increasingly crucial for improving table management, discovery, and analysis. However, existing…

Machine Learning · Computer Science 2024-11-07 Dayu Yang, Natawut Monaikul, Amanda Ding, Bozhao Tan, Kishore Mosaliganti, Giri Iyengar

A Note on Statistically Accurate Tabular Data Generation Using Large Language Models

Large language models (LLMs) have shown promise in synthetic tabular data generation, yet existing methods struggle to preserve complex feature dependencies, particularly among categorical variables. This work introduces a…

Machine Learning · Computer Science 2025-05-07 Andrey Sidorenko

LLM Meeting Decision Trees on Tabular Data

Tabular data have been playing a vital role in diverse real-world fields, including healthcare, finance, etc. With the recent success of Large Language Models (LLMs), early explorations of extending LLMs to the domain of tabular data have…

Machine Learning · Computer Science 2025-12-11 Hangting Ye, Jinmeng Li, He Zhao, Dandan Guo, Yi Chang

Generating Realistic Tabular Data with Large Language Models

While most generative models show achievements in image data generation, few are developed for tabular data generation. Recently, due to success of large language models (LLM) in diverse tasks, they have also been used for tabular data…

Machine Learning · Computer Science 2024-10-30 Dang Nguyen, Sunil Gupta, Kien Do, Thin Nguyen, Svetha Venkatesh

A Survey on Generative Recommendation: Data, Model, and Tasks

Recommender systems serve as foundational infrastructure in modern information ecosystems, helping users navigate digital content and discover items aligned with their preferences. At their core, recommender systems address a fundamental…

Information Retrieval · Computer Science 2026-05-12 Min Hou, Le Wu, Yuxin Liao, Yonghui Yang, Zhen Zhang, Yu Wang, Changlong Zheng, Han Wu, Richang Hong

Principled Synthetic Data Enables the First Scaling Laws for LLMs in Recommendation

Large Language Models (LLMs) represent a promising frontier for recommender systems, yet their development has been impeded by the absence of predictable scaling laws, which are crucial for guiding research and optimizing resource…

Information Retrieval · Computer Science 2026-02-16 Benyu Zhang, Qiang Zhang, Jianpeng Cheng, Hong-You Chen, Qifei Wang, Wei Sun, Shen Li, Jia Li, Jiahao Wu, Xiangjun Fan, Hong Yan

TabuLa: Harnessing Language Models for Tabular Data Synthesis

Tabular data synthesis is crucial for addressing privacy and security concerns in industries reliant on tabular data. While recent advancements adopt large language models (LLMs) for realistic tabular data generation, their long training…

Machine Learning · Computer Science 2025-02-18 Zilong Zhao, Robert Birke, Lydia Chen

Data Imputation using Large Language Model to Accelerate Recommendation System

This paper aims to address the challenge of sparse and missing data in recommendation systems, a significant hurdle in the age of big data. Traditional imputation methods struggle to capture complex relationships within the data. We propose…

Information Retrieval · Computer Science 2024-08-09 Zhicheng Ding, Jiahao Tian, Zhenkai Wang, Jinman Zhao, Siyang Li

Human-LLM Collaborative Feature Engineering for Tabular Data

Large language models (LLMs) are increasingly used to automate feature engineering in tabular learning. Given task-specific information, LLMs can propose diverse feature transformation operations to enhance downstream model performance.…

Machine Learning · Computer Science 2026-01-30 Zhuoyan Li, Aditya Bansal, Jinzhao Li, Shishuang He, Zhuoran Lu, Mutian Zhang, Qin Liu, Yiwei Yang, Swati Jain, Ming Yin, Yunyao Li

Large Language Models Make Sample-Efficient Recommender Systems

Large language models (LLMs) have achieved remarkable progress in the field of natural language processing (NLP), demonstrating remarkable abilities in producing text that resembles human language for various tasks. This opens up new…

Information Retrieval · Computer Science 2024-06-05 Jianghao Lin, Xinyi Dai, Rong Shan, Bo Chen, Ruiming Tang, Yong Yu, Weinan Zhang

TableLLM: Enabling Tabular Data Manipulation by LLMs in Real Office Usage Scenarios

We introduce TableLLM, a robust large language model (LLM) with 8 billion parameters, purpose-built for proficiently handling tabular data manipulation tasks, whether they are embedded within documents or spreadsheets, catering to…

Computation and Language · Computer Science 2025-02-18 Xiaokang Zhang, Sijia Luo, Bohan Zhang, Zeyao Ma, Jing Zhang, Yang Li, Guanlin Li, Zijun Yao, Kangli Xu, Jinchang Zhou, Daniel Zhang-Li, Jifan Yu, Shu Zhao, Juanzi Li, Jie Tang

Harnessing LLMs Explanations to Boost Surrogate Models in Tabular Data Classification

Large Language Models (LLMs) have shown remarkable ability in solving complex tasks, making them a promising tool for enhancing tabular learning. However, existing LLM-based methods suffer from high resource requirements, suboptimal…

Machine Learning · Computer Science 2025-05-12 Ruxue Shi, Hengrui Gu, Xu Shen, Xin Wang

A Survey on Data Synthesis and Augmentation for Large Language Models

The success of Large Language Models (LLMs) is inherently linked to the availability of vast, diverse, and high-quality data for training and evaluation. However, the growth rate of high-quality data is significantly outpaced by the…

Computation and Language · Computer Science 2024-10-18 Ke Wang, Jiahui Zhu, Minjie Ren, Zeming Liu, Shiwei Li, Zongye Zhang, Chenkai Zhang, Xiaoyu Wu, Qiqi Zhan, Qingjie Liu, Yunhong Wang

MALLM-GAN: Multi-Agent Large Language Model as Generative Adversarial Network for Synthesizing Tabular Data

In the era of big data, access to abundant data is crucial for driving research forward. However, such data is often inaccessible due to privacy concerns or high costs, particularly in healthcare domain. Generating synthetic (tabular) data…

Machine Learning · Computer Science 2026-04-10 Yaobin Ling, Xiaoqian Jiang, Yejin Kim

Limited Reference, Reliable Generation: A Two-Component Framework for Tabular Data Generation in Low-Data Regimes

Synthetic tabular data generation is increasingly essential in data management, supporting downstream applications when real-world and high-quality tabular data is insufficient. Existing tabular generation approaches, such as generative…

Machine Learning · Computer Science 2025-09-15 Mingxuan Jiang, Yongxin Wang, Ziyue Dai, Yicun Liu, Hongyi Nie, Sen Liu, Hongfeng Chai

SelectLLM: Query-Aware Efficient Selection Algorithm for Large Language Models

Large language models (LLMs) have been widely adopted due to their remarkable performance across various applications, driving the accelerated development of a large number of diverse models. However, these individual LLMs show limitations…

Computation and Language · Computer Science 2025-06-13 Kaushal Kumar Maurya, KV Aditya Srivatsa, Ekaterina Kochmar

Aligning Large Language Models for Controllable Recommendations

Inspired by the exceptional general intelligence of Large Language Models (LLMs), researchers have begun to explore their application in pioneering the next generation of recommender systems - systems that are conversational, explainable,…

Information Retrieval · Computer Science 2024-08-06 Wensheng Lu, Jianxun Lian, Wei Zhang, Guanghua Li, Mingyang Zhou, Hao Liao, Xing Xie

SelectLLM: Can LLMs Select Important Instructions to Annotate?

Instruction tuning benefits from large and diverse datasets; however, creating such datasets involves a high cost of human labeling. While synthetic datasets generated by large language models (LLMs) have partly solved this issue, they…

Computation and Language · Computer Science 2024-08-28 Ritik Sachin Parkar, Jaehyung Kim, Jong Inn Park, Dongyeop Kang

A Survey on Large Language Models for Recommendation

Large Language Models (LLMs) have emerged as powerful tools in the field of Natural Language Processing (NLP) and have recently gained significant attention in the domain of Recommendation Systems (RS). These models, trained on massive…

Information Retrieval · Computer Science 2024-06-19 Likang Wu, Zhi Zheng, Zhaopeng Qiu, Hao Wang, Hongchao Gu, Tingjia Shen, Chuan Qin, Chen Zhu, Hengshu Zhu, Qi Liu, Hui Xiong, Enhong Chen

Towards Next-Generation LLM-based Recommender Systems: A Survey and Beyond

Large language models (LLMs) have not only revolutionized the field of natural language processing (NLP) but also have the potential to bring a paradigm shift in many other fields due to their remarkable abilities of language understanding,…

Information Retrieval · Computer Science 2024-10-29 Qi Wang, Jindong Li, Shiqi Wang, Qianli Xing, Runliang Niu, He Kong, Rui Li, Guodong Long, Yi Chang, Chengqi Zhang