Related papers: Do oral messages help visual search?

200 papers

Input multimodality combining speech and hand gestures has motivated numerous usability studies. Contrastingly, issues relating to the design and ergonomic evaluation of multimodal output messages combining speech with visual modalities…

Human-Computer Interaction · Computer Science 2007-09-05 Suzanne Kieffer, Noëlle Carbonell

This paper describes an experimental study that aims at assessing the actual contribution of voice system messages to visual search efficiency and comfort. Messages which include spatial information on the target location are meant to…

Human-Computer Interaction · Computer Science 2007-10-04 Suzanne Kieffer, Noëlle Carbonell

The main aim of the work presented here is to contribute to computer science advances in the multimodal usability area, inasmuch as it addresses one of the major issues relating to the generation of effective oral system messages: how to…

Human-Computer Interaction · Computer Science 2007-08-28 Suzanne Kieffer, Noëlle Carbonell

Humans' sense of distance depends on the integration of multisensory cues. The incoming visual luminance, auditory pitch and tactile vibration could all contribute to the ability of distance judgement. This ability can be enhanced if the…

Human-Computer Interaction · Computer Science 2020-02-18 Feng Feng, Tony Stockman

Individuals, despite having varied life experiences and learning processes, can communicate effectively through languages. This study aims to explore the efficiency of language as a communication medium. We put forth two specific…

Machine Learning · Computer Science 2024-10-21 Hang Chen, Yuchuan Jang, Weijie Zhou, Cristian Meo, Ziwei Chen, Dianbo Liu

We present an introspection of an audiovisual speech enhancement model. In particular, we focus on interpreting how a neural audiovisual speech enhancement model uses visual cues to improve the quality of the target speech signal. We show…

Conversational search systems increasingly employ clarifying questions to refine user queries and improve the search experience. Previous studies have demonstrated the usefulness of text-based clarifying questions in enhancing both…

Computation and Language · Computer Science 2026-02-10 Clemencia Siro, Zahra Abbasiantaeb, Yifei Yuan, Mohammad Aliannejadi, Maarten de Rijke

Augmented and mixed-reality techniques harbor a great potential for improving human-robot collaboration. Visual signals and cues may be projected to a human partner in order to explicitly communicate robot intentions and goals. However, it…

Robotics · Computer Science 2023-08-22 Shubham Sonawani, Yifan Zhou, Heni Ben Amor

Selection of occluded objects is a challenging problem in virtual reality, even more so if multiple objects are involved. With the advent of new artificial intelligence technologies, we explore the possibility of leveraging large language…

Human-Computer Interaction · Computer Science 2024-10-29 Junlong Chen, Jens Grubert, Per Ola Kristensson

In order for robots to operate effectively in homes and workplaces, they must be able to manipulate the articulated objects common within environments built for and by humans. Previous work learns kinematic models that prescribe this…

Robotics · Computer Science 2016-07-04 Zhengyang Wu, Mohit Bansal, Matthew R. Walter

Three types of video surrogates - visual (keyframes), verbal (keywords/phrases), and combination of the two - were designed and studied in a qualitative investigation of user cognitive processes. The results favor the combined surrogates in…

Digital Libraries · Computer Science 2007-05-23 Wei Ding, Gary Marchionini, Dagobert Soergel

Multi-modal learning, particularly among imaging and linguistic modalities, has made amazing strides in many high-level fundamental visual understanding problems, ranging from language grounding to dense event captioning. However, much of…

Computer Vision and Pattern Recognition · Computer Science 2019-10-28 Tanzila Rahman, Bicheng Xu, Leonid Sigal

Multimodal machine translation involves drawing information from more than one modality, based on the assumption that the additional modalities will contain useful alternative views of the input data. The most prominent tasks in this area…

Computation and Language · Computer Science 2019-12-02 Umut Sulubacak, Ozan Caglayan, Stig-Arne Grönroos, Aku Rouhe, Desmond Elliott, Lucia Specia, Jörg Tiedemann

Interaction plays a vital role during visual network exploration as users need to engage with both elements in the view (e.g., nodes, links) and interface controls (e.g., sliders, dropdown menus). Particularly as the size and complexity of…

Human-Computer Interaction · Computer Science 2020-05-01 Ayshwarya Saktheeswaran, Arjun Srinivasan, John Stasko

Large language models have demonstrated robust performance on various language tasks using zero-shot or few-shot learning paradigms. While being actively researched, multimodal models that can additionally handle images as input have yet to…

Computation and Language · Computer Science 2023-05-24 Sherzod Hakimov, David Schlangen

Object referring has important applications, especially for human-machine interaction. While having received great attention, the task is mainly attacked with written language (text) as input rather than spoken language (speech), which is…

Computer Vision and Pattern Recognition · Computer Science 2017-12-06 Arun Balajee Vasudevan, Dengxin Dai, Luc Van Gool

Multi-modal word semantics aims to enhance embeddings with perceptual input, assuming that human meaning representation is grounded in sensory experience. Most research focuses on evaluation involving direct visual input; however, visual…

Computation and Language · Computer Science 2021-10-07 Anita L. Verő, Ann Copestake

The potential of multimodal generative artificial intelligence (mAI) to replicate human grounded language understanding, including the pragmatic, context-rich aspects of communication, remains to be clarified. Humans are known to use…

When searching for an object, humans navigate through a scene using semantic information and spatial relationships. We look for an object using our knowledge of its attributes and relationships with other objects to infer the probable…

Computer Vision and Pattern Recognition · Computer Science 2018-12-18 Jean-Benoit Delbrouck, Stéphane Dupont

Despite the impressive advancements achieved through vision-and-language pretraining, it remains unclear whether this joint learning paradigm can help understand each individual modality. In this work, we conduct a comparative analysis of…

Computer Vision and Pattern Recognition · Computer Science 2024-01-31 Zhuowan Li, Cihang Xie, Benjamin Van Durme, Alan Yuille