Related papers: Task-Oriented Human-Object Interactions Generation…

IMos: Intent-Driven Full-Body Motion Synthesis for Human-Object Interactions

Can we make virtual characters in a scene interact with their surrounding objects through simple instructions? Is it possible to synthesize such motion plausibly with a diverse set of objects and instructions? Inspired by these questions,…

Computer Vision and Pattern Recognition · Computer Science 2023-02-28 Anindita Ghosh, Rishabh Dabral, Vladislav Golyanik, Christian Theobalt, Philipp Slusallek

Object Motion Guided Human Motion Synthesis

Modeling human behaviors in contextual environments has a wide range of applications in character animation, embodied AI, VR/AR, and robotics. In real-world scenarios, humans frequently interact with the environment and manipulate various…

Computer Vision and Pattern Recognition · Computer Science 2023-09-29 Jiaman Li, Jiajun Wu, C. Karen Liu

HO-Flow: Generalizable Hand-Object Interaction Generation with Latent Flow Matching

Generating realistic 3D hand-object interactions (HOI) is a fundamental challenge in computer vision and robotics, requiring both temporal coherence and high-fidelity physical plausibility. Existing methods remain limited in their ability…

Computer Vision and Pattern Recognition · Computer Science 2026-04-14 Zerui Chen, Rolandos Alexandros Potamias, Shizhe Chen, Jiankang Deng, Cordelia Schmid, Stefanos Zafeiriou

Learning to Generate Human-Human-Object Interactions from Textual Descriptions

The way humans interact with each other, including interpersonal distances, spatial configuration, and motion, varies significantly across different situations. To enable machines to understand such complex, context-dependent behaviors, it…

Computer Vision and Pattern Recognition · Computer Science 2025-12-25 Jeonghyeon Na, Sangwon Baik, Inhee Lee, Junyoung Lee, Hanbyul Joo

TIMotion: Temporal and Interactive Framework for Efficient Human-Human Motion Generation

Human-human motion generation is essential for understanding humans as social beings. Current methods fall into two main categories: single-person-based methods and separate modeling-based methods. To delve into this field, we abstract the…

Computer Vision and Pattern Recognition · Computer Science 2026-03-11 Yabiao Wang, Shuo Wang, Jiangning Zhang, Ke Fan, Jiafu Wu, Zhucun Xue, Yong Liu

Modeling Dynamic Hand-Object Interactions with Applications to Human-Robot Handovers

Humans frequently grasp, manipulate, and move objects. Interactive systems assist humans in these tasks, enabling applications in Embodied AI, human-robot interaction, and virtual reality. However, current methods in hand-object synthesis…

Robotics · Computer Science 2025-03-10 Sammy Christen

Text-driven Motion Generation: Overview, Challenges and Directions

Text-driven motion generation offers a powerful and intuitive way to create human movements directly from natural language. By removing the need for predefined motion inputs, it provides a flexible and accessible approach to controlling…

Computer Vision and Pattern Recognition · Computer Science 2025-05-15 Ali Rida Sahili, Najett Neji, Hedi Tabia

SynH2R: Synthesizing Hand-Object Motions for Learning Human-to-Robot Handovers

Vision-based human-to-robot handover is an important and challenging task in human-robot interaction. Recent work has attempted to train robot policies by interacting with dynamic virtual humans in simulated environments, where the policies…

Robotics · Computer Science 2025-01-03 Sammy Christen, Lan Feng, Wei Yang, Yu-Wei Chao, Otmar Hilliges, Jie Song

CG-HOI: Contact-Guided 3D Human-Object Interaction Generation

We propose CG-HOI, the first method to address the task of generating dynamic 3D human-object interactions (HOIs) from text. We model the motion of both human and object in an interdependent fashion, as semantically rich human motion rarely…

Computer Vision and Pattern Recognition · Computer Science 2024-05-20 Christian Diller, Angela Dai

TACO: Benchmarking Generalizable Bimanual Tool-ACtion-Object Understanding

Humans commonly work with multiple objects in daily life and can intuitively transfer manipulation skills to novel objects by understanding object functional regularities. However, existing technical approaches for analyzing and…

Computer Vision and Pattern Recognition · Computer Science 2024-03-26 Yun Liu, Haolin Yang, Xu Si, Ling Liu, Zipeng Li, Yuxiang Zhang, Yebin Liu, Li Yi

Text2HOI: Text-guided 3D Motion Generation for Hand-Object Interaction

This paper introduces the first text-guided work for generating the sequence of hand-object interaction in 3D. The main challenge arises from the lack of labeled data where existing ground-truth datasets are nowhere near generalizable in…

Computer Vision and Pattern Recognition · Computer Science 2024-04-03 Junuk Cha, Jihyeon Kim, Jae Shin Yoon, Seungryul Baek

Decoupled Generative Modeling for Human-Object Interaction Synthesis

Synthesizing realistic human-object interaction (HOI) is essential for 3D computer vision and robotics, underpinning animation and embodied control. Existing approaches often require manually specified intermediate waypoints and place all…

Computer Vision and Pattern Recognition · Computer Science 2026-04-14 Hwanhee Jung, Seunggwan Lee, Jeongyoon Yoon, SeungHyeon Kim, Giljoo Nam, Qixing Huang, Sangpil Kim

InteractMove: Text-Controlled Human-Object Interaction Generation in 3D Scenes with Movable Objects

We propose a novel task of text-controlled human object interaction generation in 3D scenes with movable objects. Existing human-scene interaction datasets suffer from insufficient interaction categories and typically only consider…

Computer Vision and Pattern Recognition · Computer Science 2025-09-30 Xinhao Cai, Minghang Zheng, Xin Jin, Yang Liu

UniHM: Universal Human Motion Generation with Object Interactions in Indoor Scenes

Human motion synthesis in complex scenes presents a fundamental challenge, extending beyond conventional Text-to-Motion tasks by requiring the integration of diverse modalities such as static environments, movable objects, natural language…

Graphics · Computer Science 2025-05-20 Zichen Geng, Zeeshan Hayder, Wei Liu, Ajmal Mian

VHOI: Controllable Video Generation of Human-Object Interactions from Sparse Trajectories via Motion Densification

Synthesizing realistic human-object interactions (HOI) in video is challenging due to the complex, instance-specific interaction dynamics of both humans and objects. Incorporating controllability in video generation further adds to the…

Computer Vision and Pattern Recognition · Computer Science 2026-04-09 Wanyue Zhang, Lin Geng Foo, Thabo Beeler, Rishabh Dabral, Christian Theobalt

DiffH2O: Diffusion-Based Synthesis of Hand-Object Interactions from Textual Descriptions

Generating natural hand-object interactions in 3D is challenging as the resulting hand and object motions are expected to be physically plausible and semantically meaningful. Furthermore, generalization to unseen objects is hindered by the…

Computer Vision and Pattern Recognition · Computer Science 2024-12-24 Sammy Christen, Shreyas Hampali, Fadime Sener, Edoardo Remelli, Tomas Hodan, Eric Sauser, Shugao Ma, Bugra Tekin

Scene-aware Generative Network for Human Motion Synthesis

We revisit human motion synthesis, a task useful in various real world applications, in this paper. Whereas a number of methods have been developed previously for this task, they are often limited in two aspects: focusing on the poses while…

Computer Vision and Pattern Recognition · Computer Science 2021-06-01 Jingbo Wang, Sijie Yan, Bo Dai, Dahua Lin

TOUCH: Text-guided Controllable Generation of Free-Form Hand-Object Interactions

Hand-object interaction (HOI) is fundamental for humans to express intent. Existing HOI generation research is predominantly confined to fixed grasping patterns, where control is tied to physical priors such as force closure or generic…

Computer Vision and Pattern Recognition · Computer Science 2025-10-17 Guangyi Han, Wei Zhai, Yuhang Yang, Yang Cao, Zheng-Jun Zha

Towards Immersive Human-X Interaction: A Real-Time Framework for Physically Plausible Motion Synthesis

Real-time synthesis of physically plausible human interactions remains a critical challenge for immersive VR/AR systems and humanoid robotics. While existing methods demonstrate progress in kinematic motion generation, they often fail to…

Computer Vision and Pattern Recognition · Computer Science 2025-08-05 Kaiyang Ji, Ye Shi, Zichen Jin, Kangyi Chen, Lan Xu, Yuexin Ma, Jingyi Yu, Jingya Wang

SimGenHOI: Physically Realistic Whole-Body Humanoid-Object Interaction via Generative Modeling and Reinforcement Learning

Generating physically realistic humanoid-object interactions (HOI) is a fundamental challenge in robotics. Existing HOI generation approaches, such as diffusion-based models, often suffer from artifacts such as implausible contacts,…

Robotics · Computer Science 2025-08-21 Yuhang Lin, Yijia Xie, Jiahong Xie, Yuehao Huang, Ruoyu Wang, Jiajun Lv, Yukai Ma, Xingxing Zuo