Related papers: InteractDiffusion: Interaction Control in Text-to-…

200 papers

Prevalent human-object interaction (HOI) detection approaches typically leverage large-scale visual-linguistic models to help recognize events involving humans and objects. Though promising, models trained via contrastive learning on…

Computer Vision and Pattern Recognition · Computer Science 2024-10-29 Liulei Li, Wenguan Wang, Yi Yang

We address the problem of generating realistic 3D human-object interactions (HOIs) driven by textual prompts. To this end, we take a modular design and decompose the complex task into simpler sub-tasks. We first develop a dual-branch…

Computer Vision and Pattern Recognition · Computer Science 2025-07-08 Xiaogang Peng, Yiming Xie, Zizhao Wu, Varun Jampani, Deqing Sun, Huaizu Jiang

This paper investigates the problem of the current HOI detection methods and introduces DiffHOI, a novel HOI detection scheme grounded on a pre-trained text-image diffusion model, which enhances the detector's performance via improved data…

Computer Vision and Pattern Recognition · Computer Science 2023-05-23 Jie Yang, Bingliang Li, Fengyu Yang, Ailing Zeng, Lei Zhang, Ruimao Zhang

Recently, large-scale text-to-image (T2I) diffusion models have emerged as a powerful tool for image-to-image translation (I2I), allowing open-domain image translation via user-provided text prompts. This paper proposes frequency-controlled…

Computer Vision and Pattern Recognition · Computer Science 2025-03-28 Xiang Gao, Zhengbo Xu, Junhan Zhao, Jiaying Liu

Text-to-image (T2I) generative diffusion models have demonstrated outstanding performance in synthesizing diverse, high-quality visuals from text captions. Several layout-to-image models have been developed to control the generation process…

Computer Vision and Pattern Recognition · Computer Science 2025-02-11 Ahmad Süleyman, Göksel Biricik

Human-object interaction (HOI) detection often faces high levels of ambiguity and indeterminacy, as the same interaction can appear vastly different across different human-object pairs. Additionally, the indeterminacy can be further…

Computer Vision and Pattern Recognition · Computer Science 2025-03-25 Xiaofei Hui, Haoxuan Qu, Hossein Rahmani, Jun Liu

This paper addresses a novel task of anticipating 3D human-object interactions (HOIs). Most existing research on HOI synthesis lacks comprehensive whole-body interactions with dynamic objects, e.g., often limited to manipulating small or…

Computer Vision and Pattern Recognition · Computer Science 2023-09-01 Sirui Xu, Zhengyuan Li, Yu-Xiong Wang, Liang-Yan Gui

In the rapidly advancing realm of visual generation, diffusion models have revolutionized the landscape, marking a significant shift in capabilities with their impressive text-guided generative functions. However, relying solely on text for…

Computer Vision and Pattern Recognition · Computer Science 2026-01-09 Pu Cao, Feng Zhou, Qing Song, Lu Yang

Large-scale text-to-image diffusion models have been a revolutionary milestone in the evolution of generative AI and multimodal technology, allowing wonderful image generation with natural-language text prompts. However, the issue of lacking…

Computer Vision and Pattern Recognition · Computer Science 2025-11-18 Xiang Gao, Jiaying Liu

We present HOIDiNi, a text-driven diffusion framework for synthesizing realistic and plausible human-object interaction (HOI). HOI generation is extremely challenging since it induces strict contact accuracies alongside a diverse motion…

Computer Vision and Pattern Recognition · Computer Science 2025-10-22 Roey Ron, Guy Tevet, Haim Sawdayee, Amit H. Bermano

Image editing aims to edit a given synthetic or real image to meet specific user requirements. It has been widely studied in recent years as a promising and challenging field of Artificial Intelligence Generative Content (AIGC).…

Computer Vision and Pattern Recognition · Computer Science 2024-06-21 Xincheng Shuai, Henghui Ding, Xingjun Ma, Rongcheng Tu, Yu-Gang Jiang, Dacheng Tao

This paper addresses new methodologies to deal with the challenging task of generating dynamic Human-Object Interactions from textual descriptions (Text2HOI). While most existing works assume interactions with limited body parts or static…

Computer Vision and Pattern Recognition · Computer Science 2024-03-19 Qianyang Wu, Ye Shi, Xiaoshui Huang, Jingyi Yu, Lan Xu, Jingya Wang

Text-driven Human-Object Interaction (Text-to-HOI) generation is an emerging field with applications in animation, video games, virtual reality, and robotics. A key challenge in HOI generation is maintaining interaction consistency in long…

Graphics · Computer Science 2025-03-24 Zichen Geng, Zeeshan Hayder, Wei Liu, Ajmal Saeed Mian

We propose a diffusion-based approach for Text-to-Image (T2I) generation with interactive 3D layout control. Layout control has been widely studied to alleviate the shortcomings of T2I diffusion models in understanding objects' placement…

Computer Vision and Pattern Recognition · Computer Science 2024-08-28 Abdelrahman Eldesokey, Peter Wonka

The proliferation of text-to-image diffusion models (T2I DMs) has led to an increased presence of AI-generated images in daily life. However, biased T2I models can generate content with specific tendencies, potentially influencing people's…

Computer Vision and Pattern Recognition · Computer Science 2025-04-03 Huayang Huang, Xiangye Jin, Jiaxu Miao, Yu Wu

3D hand-object interaction data is scarce due to the hardware constraints in scaling up the data collection process. In this paper, we propose HOIDiffusion for generating realistic and diverse 3D hand-object interaction data. Our model is a…

Computer Vision and Pattern Recognition · Computer Science 2024-03-19 Mengqi Zhang, Yang Fu, Zheng Ding, Sifei Liu, Zhuowen Tu, Xiaolong Wang

Modeling 3D human-object interaction (HOI) is a problem of great interest for computer vision and a key enabler for virtual and mixed-reality applications. Existing methods work in a one-way direction: some recover plausible human…

Computer Vision and Pattern Recognition · Computer Science 2025-07-29 Ilya A. Petrov, Riccardo Marin, Julian Chibane, Gerard Pons-Moll

Large-scale diffusion models have achieved state-of-the-art results on text-to-image synthesis (T2I) tasks. Despite their ability to generate high-quality yet creative images, we observe that attribution-binding and compositional…

Computer Vision and Pattern Recognition · Computer Science 2023-03-02 Weixi Feng, Xuehai He, Tsu-Jui Fu, Varun Jampani, Arjun Akula, Pradyumna Narayana, Sugato Basu, Xin Eric Wang, William Yang Wang

Diffusion models revolutionize image generation by leveraging natural language to guide the creation of multimedia content. Despite significant advancements in such generative models, challenges persist in depicting detailed human-object…

We propose a diffusion-based approach for Text-to-Image (T2I) generation with consistent and interactive 3D layout control and editing. While prior methods improve spatial adherence using 2D cues or iterative copy-warp-paste strategies,…

Computer Vision and Pattern Recognition · Computer Science 2026-01-21 Andrea Rigo, Luca Stornaiuolo, Weijie Wang, Mauro Martino, Bruno Lepri, Nicu Sebe