Related papers: Gaze as a Supplementary Modality for Interacting w…
Recent advancements in eye tracking technology are driving the adoption of gaze-assisted interaction as a rich and accessible human-computer interaction paradigm. Gaze-assisted interaction serves as a contextual, non-invasive, and explicit…
In recent years we have witnessed an increasing number of interactive systems on handheld mobile devices which utilise gaze as a single or complementary interaction modality. This trend is driven by the enhanced computational power of these…
Gaze-based interactions offer a potential way for users to naturally engage with mixed reality (XR) interfaces. Black-box machine learning models enabled higher accuracy for gaze-based interactions. However, due to the black-box nature of…
Gaze input, as a modality inherently conveying user intent, offers intuitive and immersive experiences in extended reality (XR). With eye-tracking now being a standard feature in modern XR headsets, gaze has been extensively applied to…
Appearance-based gaze estimation methods that only require an off-the-shelf camera have significantly improved but they are still not yet widely used in the human-computer interaction (HCI) community. This is partly because it remains…
Although eye-tracking technology is being integrated into more VR and MR headsets, the true potential of eye tracking in enhancing user interactions within XR settings remains relatively untapped. Presently, one of the most prevalent gaze…
The eyes play an important role in human collaboration. Mutual and shared gaze help communicate visual attention to each other or to a specific object of interest. Shared gaze was typically investigated for pair collaborations in remote…
Gaze reflects how humans process visual scenes and is therefore increasingly used in computer vision systems. Previous works demonstrated the potential of gaze for object-centric tasks, such as object localization and recognition, but it…
Avatars on displays lack the ability to engage with the physical environment through gaze. To address this limitation, we propose a gaze synthesis method that enables animated avatars to establish gaze communication with the physical…
Gaze following estimates gaze targets of in-scene person by understanding human behavior and scene information. Existing methods usually analyze scene images for gaze following. However, compared with visual images, audio also provides…
Advanced multimodal AI agents can now collaborate with users to solve challenges in the world. Yet, these emerging contextual AI systems rely on explicit communication channels between the user and system. We hypothesize that implicit…
The potential of gaze for hands-free mobile interaction is increasingly evident. While each gaze input technique presents distinct advantages and limitations, a combination can amplify strengths and mitigate challenges. We report on the…
In this study, we investigated gaze-based interaction methods within a virtual reality game with a visual search task with 52 participants. We compared four different interaction techniques: Selection by dwell time or confirmation of…
Modern information querying systems are progressively incorporating multimodal inputs like vision and audio. However, the integration of gaze -- a modality deeply linked to user intent and increasingly accessible via gaze-tracking wearables…
Video conferences play a vital role in our daily lives. However, many nonverbal cues are missing, including gaze and spatial information. We introduce LookAtChat, a web-based video conferencing system, which empowers remote users to…
Current LLM assistants are powerful at answering questions, but they have limited access to the behavioral context that reveals when and where a user is struggling. We present a gaze-grounded multimodal LLM assistant that uses egocentric…
Vision-based Interfaces (VIs) are pivotal in advancing Human-Computer Interaction (HCI), particularly in enhancing context awareness. However, there are significant opportunities for these interfaces due to rapid advancements in multimodal…
With the rise in popularity of smart glasses, users' attention has been integrated into Vision-Language Models (VLMs) to streamline multi-modal querying in daily scenarios. However, leveraging gaze data to model users' attention may…
Augmented Reality (AR) see-through vision is an interesting research topic since it enables users to see through a wall and see the occluded objects. Most existing research focuses on the visual effects of see-through vision, while the…
Eye gaze analysis is an important research problem in the field of Computer Vision and Human-Computer Interaction. Even with notable progress in the last 10 years, automatic gaze analysis still remains challenging due to the uniqueness of…