Related papers: Understanding HTML with Large Language Models

Leveraging Large Language Models for Web Scraping

Large Language Models (LLMs) demonstrate remarkable capabilities in replicating human tasks and boosting productivity. However, their direct application for data extraction presents limitations due to a prioritisation of fluency over…

Computation and Language · Computer Science 2024-06-13 Aman Ahluwalia, Suhrud Wani

"What's important here?": Opportunities and Challenges of Using LLMs in Retrieving Information from Web Interfaces

Large language models (LLMs) that have been trained on a corpus that includes a large amount of code exhibit a remarkable ability to understand HTML code. As web interfaces are primarily constructed using HTML, we design an in-depth study to…

Computation and Language · Computer Science 2023-12-12 Faria Huq, Jeffrey P. Bigham, Nikolas Martelaro

Survey of different Large Language Model Architectures: Trends, Benchmarks, and Challenges

Large Language Models (LLMs) represent a class of deep learning models adept at understanding natural language and generating coherent responses to various prompts or queries. These models far exceed the complexity of conventional neural…

Machine Learning · Computer Science 2024-12-05 Minghao Shao, Abdul Basit, Ramesh Karri, Muhammad Shafique

Tiny language models

A prominent achievement of natural language processing (NLP) is its ability to understand and generate meaningful human language. This capability relies on complex feedforward transformer block architectures pre-trained on large language…

Computation and Language · Computer Science 2025-11-11 Ronit D. Gross, Yarden Tzach, Tal Halevi, Ella Koresh, Ido Kanter

Navigating WebAI: Training Agents to Complete Web Tasks with Large Language Models and Reinforcement Learning

Recent advancements in language models have demonstrated remarkable improvements in various natural language processing (NLP) tasks such as web navigation. Supervised learning (SL) approaches have achieved impressive performance while…

Machine Learning · Computer Science 2024-05-31 Lucas-Andreï Thil, Mirela Popa, Gerasimos Spanakis

On the Potential of Large Language Models to Solve Semantics-Aware Process Mining Tasks

Large language models (LLMs) have been shown to be valuable tools for tackling process mining tasks. Existing studies report on their capability to support various data-driven process analyses and even, to some extent, that they are able to…

Databases · Computer Science 2025-05-01 Adrian Rebmann, Fabian David Schmidt, Goran Glavaš, Han van der Aa

Large Language Models Understand Layout

Large language models (LLMs) demonstrate extraordinary abilities in a wide range of natural language processing (NLP) tasks. In this paper, we show that, beyond text understanding capability, LLMs are capable of processing text layouts that…

Computation and Language · Computer Science 2024-08-29 Weiming Li, Manni Duan, Dong An, Yan Shao

AutoWebGLM: A Large Language Model-based Web Navigating Agent

Large language models (LLMs) have fueled many intelligent web agents, but most existing ones perform far from satisfying in real-world web navigation tasks due to three factors: (1) the complexity of HTML text data (2) versatility of…

Computation and Language · Computer Science 2024-10-15 Hanyu Lai, Xiao Liu, Iat Long Iong, Shuntian Yao, Yuxuan Chen, Pengbo Shen, Hao Yu, Hanchen Zhang, Xiaohan Zhang, Yuxiao Dong, Jie Tang

Cross-Task Benchmarking and Evaluation of General-Purpose and Code-Specific Large Language Models

Large Language Models (LLMs) have revolutionized both general natural language processing and domain-specific applications such as code synthesis, legal reasoning, and finance. However, while prior studies have explored individual model…

Software Engineering · Computer Science 2025-12-05 Gunjan Das, Paheli Bhattacharya, Rishabh Gupta

Benchmarking Large Language Models for Molecule Prediction Tasks

Large Language Models (LLMs) stand at the forefront of a number of Natural Language Processing (NLP) tasks. Despite the widespread adoption of LLMs in NLP, much of their potential in broader fields remains largely unexplored, and…

Machine Learning · Computer Science 2024-03-11 Zhiqiang Zhong, Kuangyu Zhou, Davide Mottin

Pre-trained Large Language Models Learn Hidden Markov Models In-context

Hidden Markov Models (HMMs) are foundational tools for modeling sequential data with latent Markovian structure, yet fitting them to real-world data remains computationally challenging. In this work, we show that pre-trained large language…

Machine Learning · Computer Science 2026-04-27 Yijia Dai, Zhaolin Gao, Yahya Sattar, Sarah Dean, Jennifer J. Sun

Evaluating SQL Understanding in Large Language Models

The rise of large language models (LLMs) has significantly impacted various domains, including natural language processing (NLP) and image generation, by making complex computational tasks more accessible. While LLMs demonstrate impressive…

Databases · Computer Science 2024-10-15 Ananya Rahaman, Anny Zheng, Mostafa Milani, Fei Chiang, Rachel Pottinger

HTLM: Hyper-Text Pre-Training and Prompting of Language Models

We introduce HTLM, a hyper-text language model trained on a large-scale web crawl. Modeling hyper-text has a number of advantages: (1) it is easily gathered at scale, (2) it provides rich document-level and end-task-adjacent supervision…

Computation and Language · Computer Science 2021-07-16 Armen Aghajanyan, Dmytro Okhonko, Mike Lewis, Mandar Joshi, Hu Xu, Gargi Ghosh, Luke Zettlemoyer

Leveraging Large Language Models for Scalable Vector Graphics-Driven Image Understanding

Large language models (LLMs) have made significant advancements in natural language understanding. However, through that enormous semantic representation that the LLM has learnt, is it somehow possible for it to understand images as well?…

Computer Vision and Pattern Recognition · Computer Science 2024-07-12 Mu Cai, Zeyi Huang, Yuheng Li, Utkarsh Ojha, Haohan Wang, Yong Jae Lee

Large Language Models for Code Analysis: Do LLMs Really Do Their Job?

Large language models (LLMs) have demonstrated significant potential in the realm of natural language understanding and programming code processing tasks. Their capacity to comprehend and generate human-like code has spurred research into…

Software Engineering · Computer Science 2024-03-07 Chongzhou Fang, Ning Miao, Shaurya Srivastav, Jialin Liu, Ruoyu Zhang, Ruijie Fang, Asmita, Ryan Tsang, Najmeh Nazari, Han Wang, Houman Homayoun

Large Language Models as Universal Predictors? An Empirical Study on Small Tabular Datasets

Large Language Models (LLMs), originally developed for natural language processing (NLP), have demonstrated the potential to generalize across modalities and domains. With their in-context learning (ICL) capabilities, LLMs can perform…

Artificial Intelligence · Computer Science 2025-08-26 Nikolaos Pavlidis, Vasilis Perifanis, Symeon Symeonidis, Pavlos S. Efraimidis

Harnessing Webpage UIs for Text-Rich Visual Understanding

Text-rich visual understanding, the ability to process environments where dense textual content is integrated with visuals, is crucial for multimodal large language models (MLLMs) to interact effectively with structured environments. To…

Computer Vision and Pattern Recognition · Computer Science 2024-11-07 Junpeng Liu, Tianyue Ou, Yifan Song, Yuxiao Qu, Wai Lam, Chenyan Xiong, Wenhu Chen, Graham Neubig, Xiang Yue

Exploring Large Language Models for Code Explanation

Automating code documentation through explanatory text can prove highly beneficial in code understanding. Large Language Models (LLMs) have made remarkable strides in Natural Language Processing, especially within software engineering tasks…

Software Engineering · Computer Science 2023-10-26 Paheli Bhattacharya, Manojit Chakraborty, Kartheek N S N Palepu, Vikas Pandey, Ishan Dindorkar, Rakesh Rajpurohit, Rishabh Gupta

How Do Large Language Models Understand Graph Patterns? A Benchmark for Graph Pattern Comprehension

Benchmarking the capabilities and limitations of large language models (LLMs) in graph-related tasks is becoming an increasingly popular and crucial area of research. Recent studies have shown that LLMs exhibit a preliminary ability to…

Machine Learning · Computer Science 2025-04-22 Xinnan Dai, Haohao Qu, Yifen Shen, Bohang Zhang, Qihao Wen, Wenqi Fan, Dongsheng Li, Jiliang Tang, Caihua Shan

Beyond Text: A Deep Dive into Large Language Models' Ability on Understanding Graph Data

Large language models (LLMs) have achieved impressive performance on many natural language processing tasks. However, their capabilities on graph-structured data remain relatively unexplored. In this paper, we conduct a series of…

Machine Learning · Computer Science 2023-10-10 Yuntong Hu, Zheng Zhang, Liang Zhao