Related papers: Evaluating Human-Language Model Interaction

A Map of Exploring Human Interaction patterns with LLM: Insights into Collaboration and Creativity

The outstanding performance capabilities of large language models have driven the evolution of current AI system interaction patterns. This has led to considerable discussion within the Human-AI Interaction (HAII) community. Numerous studies…

Human-Computer Interaction · Computer Science 2026-04-02 Jiayang Li, Jiale Li

HAL: Inducing Human-likeness in LLMs with Alignment

Conversational human-likeness plays a central role in human-AI interaction, yet it has remained difficult to define, measure, and optimize. As a result, improvements in human-like behavior are largely driven by scale or broad supervised…

Artificial Intelligence · Computer Science 2026-01-08 Masum Hasan, Junjie Zhao, Ehsan Hoque

Applying the Gricean Maxims to a Human-LLM Interaction Cycle: Design Insights from a Participatory Approach

While large language models (LLMs) are increasingly used to assist users in various tasks through natural language interactions, these interactions often fall short due to LLMs' limited ability to infer contextual nuances and user…

Human-Computer Interaction · Computer Science 2025-03-04 Yoonsu Kim, Brandon Chin, Kihoon Son, Seoyoung Kim, Juho Kim

Beyond Static Datasets: A Deep Interaction Approach to LLM Evaluation

Large Language Models (LLMs) have made progress in various real-world tasks, which stimulates requirements for the evaluation of LLMs. Existing LLM evaluation methods are mainly supervised signal-based, which depend on static datasets and…

Computation and Language · Computer Science 2023-09-11 Jiatong Li, Rui Li, Qi Liu

Rel-A.I.: An Interaction-Centered Approach To Measuring Human-LM Reliance

The ability to communicate uncertainty, risk, and limitation is crucial for the safety of large language models. However, current evaluations of these abilities rely on simple calibration, asking whether the language generated by the model…

Computation and Language · Computer Science 2024-10-04 Kaitlyn Zhou, Jena D. Hwang, Xiang Ren, Nouha Dziri, Dan Jurafsky, Maarten Sap

Unsupervised Evaluation of Multi-Turn Objective-Driven Interactions

Large language models (LLMs) have seen increasing popularity in enterprise applications where AI agents and humans engage in objective-driven interactions. However, these systems are difficult to evaluate: data may be complex and unlabeled;…

Machine Learning · Computer Science 2025-11-06 Emi Soroka, Tanmay Chopra, Krish Desai, Sanjay Lall

IQA-EVAL: Automatic Evaluation of Human-Model Interactive Question Answering

To evaluate Large Language Models (LLMs) for question answering (QA), traditional methods typically focus on assessing single-turn responses to given questions. However, this approach doesn't capture the dynamic nature of human-AI…

Computation and Language · Computer Science 2024-11-19 Ruosen Li, Ruochen Li, Barry Wang, Xinya Du

Who's Thinking? A Push for Human-Centered Evaluation of LLMs using the XAI Playbook

Deployed artificial intelligence (AI) often impacts humans, and there is no one-size-fits-all metric to evaluate these tools. Human-centered evaluation of AI-based systems combines quantitative and qualitative analysis and human input. It…

Human-Computer Interaction · Computer Science 2023-03-14 Teresa Datta, John P. Dickerson

Unpacking Human Preference for LLMs: Demographically Aware Evaluation with the HUMAINE Framework

The evaluation of large language models faces significant challenges. Technical benchmarks often lack real-world relevance, while existing human preference evaluations suffer from unrepresentative sampling, superficial assessment depth, and…

Computation and Language · Computer Science 2026-03-06 Nora Petrova, Andrew Gordon, Enzo Blindow

Bridging HCI and AI Research for the Evaluation of Conversational SE Assistants

As Large Language Models (LLMs) are increasingly adopted in software engineering, recently in the form of conversational assistants, ensuring these technologies align with developers' needs is essential. The limitations of traditional…

Software Engineering · Computer Science 2025-02-13 Jonan Richards, Mairieli Wessel

Does Theory of Mind Improvement Really Benefit Human-AI Interactions? Empirical Findings from Interactive Evaluations

Improving the Theory of Mind (ToM) capability of Large Language Models (LLMs) is crucial for effective social interactions between these AI models and humans. However, the existing benchmarks often measure ToM capability improvement through…

Artificial Intelligence · Computer Science 2026-05-18 Nanxu Gong, Zixin Chen, Haotian Li, Zishu Zhao, Jianxun Lian, Huamin Qu, Yanjie Fu, Xing Xie

Human and LLM-Based Voice Assistant Interaction: An Analytical Framework for User Verbal and Nonverbal Behaviors

Recent progress in large language model (LLM) technology has significantly enhanced the interaction experience between humans and voice assistants (VAs). This project aims to explore a user's continuous interaction with LLM-based VA…

Human-Computer Interaction · Computer Science 2024-09-04 Szeyi Chan, Shihan Fu, Jiachen Li, Bingsheng Yao, Smit Desai, Mirjana Prpa, Dakuo Wang

Interactive Evaluation of Large Language Models for Multi-Requirement Software Engineering Tasks

Standard single-turn, static benchmarks fall short in evaluating the nuanced capabilities of Large Language Models (LLMs) on complex tasks such as software engineering. In this work, we propose a novel interactive evaluation framework that…

Artificial Intelligence · Computer Science 2025-08-27 Dimitrios Rontogiannis, Maxime Peyrard, Nicolas Baldwin, Martin Josifoski, Robert West, Dimitrios Gunopulos

Empowering Language Models with Active Inquiry for Deeper Understanding

The rise of large language models (LLMs) has revolutionized the way that we interact with artificial intelligence systems through natural language. However, LLMs often misinterpret user queries because of their uncertain intention, leading…

Computation and Language · Computer Science 2024-02-07 Jing-Cheng Pang, Heng-Bo Fan, Pengyuan Wang, Jia-Hao Xiao, Nan Tang, Si-Hang Yang, Chengxing Jia, Sheng-Jun Huang, Yang Yu

Assistance Without Interruption: A Benchmark and LLM-based Framework for Non-Intrusive Human-Robot Assistance

Human-robot interaction (HRI) has long studied how agents and people coordinate to achieve shared goals. In this work, we formalize and benchmark the non-intrusive assistance as an independent paradigm of HRI, where a robot proactively…

Robotics · Computer Science 2026-05-05 Yuedi Zhang, Shuanghao Bai, Wanqi Zhou, Haoran Zhang, Qi Zhang, Zhirong Luan, Badong Chen

Evaluating Efficiency and Engagement in Scripted and LLM-Enhanced Human-Robot Interactions

To achieve natural and intuitive interaction with people, HRI frameworks combine a wide array of methods for human perception, intention communication, human-aware navigation and collaborative action. In practice, when encountering…

Robotics · Computer Science 2025-01-22 Tim Schreiter, Jens V. Rüppel, Rishi Hazra, Andrey Rudenko, Martin Magnusson, Achim J. Lilienthal

HLB: Benchmarking LLMs' Humanlikeness in Language Use

As synthetic data becomes increasingly prevalent in training language models, particularly through generated dialogue, concerns have emerged that these models may deviate from authentic human language patterns, potentially losing the…

Computation and Language · Computer Science 2024-09-25 Xufeng Duan, Bei Xiao, Xuemei Tang, Zhenguang G. Cai

InterFeedback: Unveiling Interactive Intelligence of Large Multimodal Models via Human Feedback

Existing benchmarks do not test Large Multimodal Models (LMMs) on their interactive intelligence with human users, which is vital for developing general-purpose AI assistants. We design InterFeedback, an interactive framework, which can be…

Computation and Language · Computer Science 2025-11-10 Henry Hengyuan Zhao, Wenqi Pei, Yifei Tao, Haiyang Mei, Mike Zheng Shou

Alignment, Exploration, and Novelty in Human-AI Interaction

Human-AI interactions are increasingly part of everyday life, yet the interpersonal dynamics that unfold during such exchanges remain underexplored. This study investigates how emotional alignment, semantic exploration, and linguistic…

Human-Computer Interaction · Computer Science 2025-12-22 Halfdan Nordahl Fundal, Johannes Eide Rambøll, Karsten Olsen

Towards Unconstrained Human-Object Interaction

Human-Object Interaction (HOI) detection is a longstanding computer vision problem concerned with predicting the interaction between humans and objects. Current HOI models rely on a vocabulary of interactions at training and inference time,…

Computer Vision and Pattern Recognition · Computer Science 2026-04-16 Francesco Tonini, Alessandro Conti, Lorenzo Vaquero, Cigdem Beyan, Elisa Ricci