Related papers: Beyond Browsing: API-Based Web Agents

FP-Agent: Fingerprinting AI Browsing Agents

AI browsing agents are an emerging class of AI-powered bots capable of autonomously navigating websites. Unlike traditional web bots, AI browsing agents typically operate using real browsers and perform everyday tasks, making them difficult…

Cryptography and Security · Computer Science 2026-05-05 Ethan Wang, Zubair Shafiq, Yash Vekaria

Build the web for agents, not agents for the web

Recent advancements in Large Language Models (LLMs) and multimodal counterparts have spurred significant interest in developing web agents -- AI systems capable of autonomously navigating and completing tasks within web environments. While…

Machine Learning · Computer Science 2025-06-13 Xing Han Lù, Gaurav Kamath, Marius Mosbach, Siva Reddy

AI Agents for Web Testing: A Case Study in the Wild

Automated web testing plays a critical role in ensuring high-quality user experiences and delivering business value. Traditional approaches primarily focus on code coverage and load testing, but often fall short of capturing complex user…

Software Engineering · Computer Science 2025-09-08 Naimeng Ye, Xiao Yu, Ruize Xu, Tianyi Peng, Zhou Yu

API Agents vs. GUI Agents: Divergence and Convergence

Large language models (LLMs) have evolved beyond simple text generation to power software agents that directly translate natural language commands into tangible actions. While API-based LLM agents initially rose to prominence for their…

Artificial Intelligence · Computer Science 2025-06-24 Chaoyun Zhang, Shilin He, Liqun Li, Si Qin, Yu Kang, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang

GUI Agents: A Survey

Graphical User Interface (GUI) agents, powered by Large Foundation Models, have emerged as a transformative approach to automating human-computer interaction. These agents autonomously interact with digital systems or software applications…

Artificial Intelligence · Computer Science 2025-09-30 Dang Nguyen, Jian Chen, Yu Wang, Gang Wu, Namyong Park, Zhengmian Hu, Hanjia Lyu, Junda Wu, Ryan Aponte, Yu Xia, Xintong Li, Jing Shi, Hongjie Chen, Viet Dac Lai, Zhouhang Xie, Sungchul Kim, Ruiyi Zhang, Tong Yu, Mehrab Tanjim, Nesreen K. Ahmed, Puneet Mathur, Seunghyun Yoon, Lina Yao, Branislav Kveton, Jihyung Kil, Thien Huu Nguyen, Trung Bui, Tianyi Zhou, Ryan A. Rossi, Franck Dernoncourt

Embodied Web Agents: Bridging Physical-Digital Realms for Integrated Agent Intelligence

AI agents today are mostly siloed - they either retrieve and reason over vast amounts of digital information and knowledge obtained online; or interact with the physical world through embodied perception, planning and action - but rarely…

Artificial Intelligence · Computer Science 2025-07-31 Yining Hong, Rui Sun, Bingxuan Li, Xingcheng Yao, Maxine Wu, Alexander Chien, Da Yin, Ying Nian Wu, Zhecan James Wang, Kai-Wei Chang

One Agent Too Many: User Perspectives on Approaches to Multi-agent Conversational AI

Conversational agents have been gaining increasing popularity in recent years. Influenced by the widespread adoption of task-oriented agents such as Apple Siri and Amazon Alexa, these agents are being deployed into various applications to…

Human-Computer Interaction · Computer Science 2024-01-17 Christopher Clarke, Karthik Krishnamurthy, Walter Talamonti, Yiping Kang, Lingjia Tang, Jason Mars

Infrastructure for AI Agents

AI agents plan and execute interactions in open-ended environments. For example, OpenAI's Operator can use a web browser to do product comparisons and buy online goods. Much research on making agents useful and safe focuses on directly…

Artificial Intelligence · Computer Science 2025-06-23 Alan Chan, Kevin Wei, Sihao Huang, Nitarshan Rajkumar, Elija Perrier, Seth Lazar, Gillian K. Hadfield, Markus Anderljung

AI Agentic workflows and Enterprise APIs: Adapting API architectures for the age of AI agents

The rapid advancement of Generative AI has catalyzed the emergence of autonomous AI agents, presenting unprecedented challenges for enterprise computing infrastructures. Current enterprise API architectures are predominantly designed for…

Software Engineering · Computer Science 2025-02-26 Vaibhav Tupe, Shrinath Thube

Orca: Browsing at Scale Through User-Driven and AI-Facilitated Orchestration Across Malleable Webpages

Web-based activities span multiple webpages. However, conventional browsers with stacks of tabs cannot support operating and synthesizing large volumes of information across pages. While recent AI systems enable fully automated web browsing…

Human-Computer Interaction · Computer Science 2026-01-29 Peiling Jiang, Haijun Xia

In-Browser Agents for Search Assistance

A fundamental tension exists between the demand for sophisticated AI assistance in web search and the need for user data privacy. Current centralized models require users to transmit sensitive browsing data to external services, which…

Human-Computer Interaction · Computer Science 2026-01-16 Saber Zerhoudi, Michael Granitzer

Toward Super Agent System with Hybrid AI Routers

AI Agents powered by Large Language Models are transforming the world through enormous applications. A super agent has the potential to fulfill diverse user needs, such as summarization, coding, and research, by accurately understanding…

Artificial Intelligence · Computer Science 2025-07-28 Yuhang Yao, Haixin Wang, Yibo Chen, Jiawen Wang, Min Chang Jordan Ren, Bosheng Ding, Salman Avestimehr, Chaoyang He

Inducing Programmatic Skills for Agentic Tasks

To succeed in common digital tasks such as web navigation, agents must carry out a variety of specialized tasks such as searching for products or planning a travel route. To tackle these tasks, agents can bootstrap themselves by learning…

Computation and Language · Computer Science 2025-09-01 Zora Zhiruo Wang, Apurva Gandhi, Graham Neubig, Daniel Fried

On the Regulatory Potential of User Interfaces for AI Agent Governance

AI agents that take actions in their environment autonomously over extended time horizons require robust governance interventions to curb their potentially consequential risks. Prior proposals for governing AI agents primarily target…

Computers and Society · Computer Science 2025-12-02 K. J. Kevin Feng, Tae Soo Kim, Rock Yuren Pang, Faria Huq, Tal August, Amy X. Zhang

How Do AI Agents Do Human Work? Comparing AI and Human Workflows Across Diverse Occupations

AI agents are continually optimized for tasks related to human work, such as software engineering and professional writing, signaling a pressing trend with significant impacts on the human workforce. However, these agent developments have…

Artificial Intelligence · Computer Science 2025-11-10 Zora Zhiruo Wang, Yijia Shao, Omar Shaikh, Daniel Fried, Graham Neubig, Diyi Yang

Exploring Current User Web Search Behaviours in Analysis Tasks to be Supported in Conversational Search

Conversational search presents opportunities to support users in their search activities to improve the effectiveness and efficiency of search while reducing their cognitive load. Limitations of the potential competency of conversational…

Human-Computer Interaction · Computer Science 2021-04-12 Abhishek Kaushik, Gareth J. F. Jones

Responsible AI Agents

Thanks to advances in large language models, a new type of software agent, the artificial intelligence (AI) agent, has entered the marketplace. Companies such as OpenAI, Google, Microsoft, and Salesforce promise their AI Agents will go from…

Computers and Society · Computer Science 2025-02-26 Deven R. Desai, Mark O. Riedl

Promoting Sustainable Web Agents: Benchmarking and Estimating Energy Consumption through Empirical and Theoretical Analysis

Web agents, like OpenAI's Operator and Google's Project Mariner, are powerful agentic systems pushing the boundaries of Large Language Models (LLM). They can autonomously interact with the internet at the user's behest, such as navigating…

Artificial Intelligence · Computer Science 2025-11-07 Lars Krupp, Daniel Geißler, Vishal Banwari, Paul Lukowicz, Jakob Karolus

An experimental platform for gathering user behavioural data via browser APIs

Websites are capable of learning a wide range of information about the platform on which a browser is executing. One major source of such information is the set of standardised Application Programming Interfaces (APIs) provided within the…

Cryptography and Security · Computer Science 2019-10-17 Zhaoyi Fan

Developer Experience with AI Coding Agents: HTTP Behavioral Signatures in Documentation Portals

The rapid adoption of AI coding agents and AI assistant web services is fundamentally changing how developers discover, consume, and interact with technical documentation. This paper studies that transformation across three interconnected…

Software Engineering · Computer Science 2026-04-06 Oleksii Borysenko