Related papers: Optimization Techniques for Unsupervised Complex T…

200 papers

Tables convey factual and quantitative data with implicit conventions created by humans that are often challenging for machines to parse. Prior work on table recognition (TR) has mainly centered around complex task-specific combinations of…

Computer Vision and Pattern Recognition · Computer Science 2024-05-28 ShengYun Peng, Aishwarya Chakravarthy, Seongmin Lee, Xiaojing Wang, Rajarajeswari Balasubramaniyan, Duen Horng Chau

Existing approaches to constructing training data for Natural Language Inference (NLI) tasks, such as for semi-structured table reasoning, are either via crowdsourcing or fully automatic methods. However, the former is expensive and…

Computation and Language · Computer Science 2022-10-25 Dibyakanti Kumar, Vivek Gupta, Soumya Sharma, Shuo Zhang

Creating challenging tabular inference data is essential for learning complex reasoning. Prior work has mostly relied on two data generation strategies. The first is human annotation, which yields linguistically diverse data but is…

Computation and Language · Computer Science 2022-11-24 Aashna Jena, Vivek Gupta, Manish Shrivastava, Julian Martin Eisenschlos

Table reasoning (TR) requires structured reasoning over semi-structured tabular data and remains challenging, particularly for small language models (SLMs, e.g., LLaMA-8B) due to their limited capacity compared to large LMs (LLMs, e.g.,…

Machine Learning · Computer Science 2025-06-09 Rihui Jin, Zheyu Xin, Xing Xie, Zuoyi Li, Guilin Qi, Yongrui Chen, Xinbang Dai, Tongtong Wu, Gholamreza Haffari

Table reasoning, including tabular QA and fact verification, often depends on annotated data or complex data augmentation, limiting flexibility and generalization. LLMs, despite their versatility, often underperform compared to simple…

Artificial Intelligence · Computer Science 2025-11-19 Yiran Rex Ma

Neural dependency parsing has proven very effective, achieving state-of-the-art results on numerous domains and languages. Unfortunately, it requires large amounts of labeled data, which is costly and laborious to create. In this paper we…

Computation and Language · Computer Science 2019-11-12 Guy Rotman, Roi Reichart

The reasoning abilities of large language models (LLMs) have improved with chain-of-thought (CoT) prompting, allowing models to solve complex tasks stepwise. However, training CoT capabilities requires detailed reasoning data, which is…

Artificial Intelligence · Computer Science 2025-04-11 Fu-Chieh Chang, Yu-Ting Lee, Hui-Ying Shih, Yi Hsuan Tseng, Pei-Yuan Wu

The prevailing paradigm for training large reasoning models--combining Supervised Fine-Tuning (SFT) with Reinforcement Learning with Verifiable Rewards (RLVR)--is fundamentally constrained by its reliance on high-quality, human-annotated…

Machine Learning · Computer Science 2026-03-24 Yuanfu Wang, Zhixuan Liu, Xiangtian Li, Chaochao Lu, Chao Yang

Current Large Language Models (LLMs) exhibit limited ability to understand table structures and to apply precise numerical reasoning, which is crucial for tasks such as table question answering (TQA) and table-based fact verification (TFV).…

Computation and Language · Computer Science 2025-07-11 Xinyuan Lu, Liangming Pan, Yubo Ma, Preslav Nakov, Min-Yen Kan

Table reasoning, encompassing tasks such as table question answering, fact verification, and text-to-SQL, requires precise understanding of structured tabular data, coupled with numerical computation and code manipulation for effective…

Computation and Language · Computer Science 2025-06-03 Fangyu Lei, Jinxiang Meng, Yiming Huang, Tinghong Chen, Yun Zhang, Shizhu He, Jun Zhao, Kang Liu

Expanding new functionalities efficiently is an ongoing challenge for single-turn task-oriented dialogue systems. In this work, we explore functionality-specific semi-supervised learning via self-training. We consider methods that augment…

Computation and Language · Computer Science 2019-10-11 Eunah Cho, He Xie, John P. Lalor, Varun Kumar, William M. Campbell

Benchmark datasets for table structure recognition (TSR) must be carefully processed to ensure they are annotated consistently. However, even if a dataset's annotations are self-consistent, there may be significant inconsistency across…

Computer Vision and Pattern Recognition · Computer Science 2023-05-25 Brandon Smock, Rohith Pesala, Robin Abraham

Data preparation, also called data wrangling, is considered one of the most expensive and time-consuming steps when performing analytics or building machine learning models. Preparing data typically involves collecting and merging data from…

Computation and Language · Computer Science 2023-06-22 Michael Glass, Xueqing Wu, Ankita Rajaram Naik, Gaetano Rossiello, Alfio Gliozzo

Data scarcity has been the main factor that hinders the progress of event extraction. To overcome this issue, we propose a Self-Training with Feedback (STF) framework that leverages the large-scale unlabeled data and acquires feedback for…

Computation and Language · Computer Science 2023-08-03 Zhiyang Xu, Jay-Yoon Lee, Lifu Huang

Tabular data is one of the most widely used data modalities, encompassing numerous datasets with substantial amounts of unlabeled data. Despite this prevalence, there is a notable lack of simple and versatile methods for utilizing unlabeled…

Machine Learning · Computer Science 2024-08-30 Minwook Kim, Juseong Kim, Ki Beom Kim, Giltae Song

Since we can leverage a large amount of unlabeled data without any human supervision to train a model and transfer the knowledge to target tasks, self-supervised learning is a de-facto component for the recent success of deep learning in…

Computation and Language · Computer Science 2021-03-12 Donggyu Kim, Seanie Lee

Tabular data is the most widely used data format in machine learning (ML). While tree-based methods outperform DL-based methods in supervised learning, recent literature reports that self-supervised learning with Transformer-based models…

Machine Learning · Computer Science 2023-05-23 Soma Onishi, Shoya Meguro

Tabular data serves as the backbone of modern data analysis and scientific research. While Large Language Models (LLMs) fine-tuned via Supervised Fine-Tuning (SFT) have significantly improved natural language interaction with such…

Table reasoning with large language models (LLMs) plays a critical role in building intelligent systems capable of understanding and analyzing tabular data. Despite recent progress, existing methods still face key limitations: their…

Artificial Intelligence · Computer Science 2026-01-27 Huajian Zhang, Mingyue Cheng, Yucong Luo, Xiaoyu Tao

Self-supervised learning has been shown to be very effective in learning useful representations, and yet much of the success is achieved in data types such as images, audio, and text. The success is mainly enabled by taking advantage of…

Machine Learning · Computer Science 2021-10-28 Talip Ucar, Ehsan Hajiramezanali, Lindsay Edwards