Related papers

Related papers: EC-Diffuser: Multi-Object Manipulation via Entity-…

200 papers

Imitation learning addresses the challenge of learning by observing an expert's demonstrations without access to reward signals from environments. Most existing imitation learning methods that do not require interacting with environments…

Machine Learning · Computer Science 2024-06-04 Shang-Fu Chen, Hsiang-Chun Wang, Ming-Hao Hsu, Chun-Mao Lai, Shao-Hua Sun

Behavior Cloning (BC) methods are effective at learning complex manipulation tasks. However, they are prone to spurious correlation - expressive models may focus on distractors that are irrelevant to action prediction - and are thus fragile…

Robotics · Computer Science 2024-08-28 Vaibhav Saxena, Yotto Koga, Danfei Xu

Generative Behavior Cloning (GBC) is a simple yet effective framework for robot learning, particularly in multi-task settings. Recent GBC methods often employ diffusion policies with open-loop (OL) control, where actions are generated via a…

Robotics · Computer Science 2025-10-15 Junhyuk So, Chiwoong Lee, Shinyoung Lee, Jungseul Ok, Eunhyeok Park

Object-centric learning aims to represent visual data with a set of object entities (a.k.a. slots), providing structured representations that enable systematic generalization. Leveraging advanced architectures like Transformers, recent…

Computer Vision and Pattern Recognition · Computer Science 2023-09-25 Ziyi Wu, Jingyu Hu, Wuyue Lu, Igor Gilitschenski, Animesh Garg

Behavior cloning (BC) has become a staple imitation learning paradigm in robotics due to its ease of teaching robots complex skills directly from expert demonstrations. However, BC suffers from an inherent generalization issue. To solve…

Robotics · Computer Science 2025-08-12 Tianyu Li, Sunan Sun, Shubhodeep Shiv Aditya, Nadia Figueroa

Recently, diffusion transformers have gained wide attention for their excellent performance in text-to-image and text-to-video models, emphasizing the need for transformers as the backbone of diffusion models. Transformer-based models have…

Computer Vision and Pattern Recognition · Computer Science 2024-04-16 Nithin Gopalakrishnan Nair, Jeya Maria Jose Valanarasu, Vishal M. Patel

Detecting objects seamlessly blended into their surroundings represents a complex task for both human cognitive capabilities and advanced artificial intelligence algorithms. Currently, the majority of methodologies for detecting camouflaged…

Computer Vision and Pattern Recognition · Computer Science 2024-07-19 Jianwei Zhao, Xin Li, Fan Yang, Qiang Zhai, Ao Luo, Zicheng Jiao, Hong Cheng

Learning from large corpora of data, pre-trained models have made impressive progress. As a popular form of generative pre-training, diffusion models capture both low-level visual knowledge and high-level semantic relations. In this…

Computer Vision and Pattern Recognition · Computer Science 2023-03-20 Chaofan Ma, Yuhuan Yang, Chen Ju, Fei Zhang, Jinxiang Liu, Yu Wang, Ya Zhang, Yanfeng Wang

Diffusion models (DMs) have emerged as a promising approach for behavior cloning (BC). Diffusion policies (DP) based on DMs have elevated BC performance to new heights, demonstrating robust efficacy across diverse tasks, coupled with their…

Computer Vision and Pattern Recognition · Computer Science 2024-05-31 Yipu Chen, Haotian Xue, Yongxin Chen

Robot learning tasks are extremely compute-intensive and hardware-specific. Thus the avenues of tackling these challenges, using a diverse dataset of offline demonstrations that can be used to train robot manipulation agents, is very…

Diffusion models have recently emerged as a powerful generative tool. Despite this progress, existing diffusion models mainly focus on uni-modal control, i.e., the diffusion process is driven by only one modality of condition. To further…

Computer Vision and Pattern Recognition · Computer Science 2023-04-21 Ziqi Huang, Kelvin C. K. Chan, Yuming Jiang, Ziwei Liu

Diffusion models have emerged as powerful generative models in the text-to-image domain. This paper studies their application as observation-to-action models for imitating human behaviour in sequential environments. Human behaviour is…

Diffusion models have emerged as a powerful generative technology and have been found to be applicable in various scenarios. Most existing foundational diffusion models are primarily designed for text-guided visual generation and do not…

Computer Vision and Pattern Recognition · Computer Science 2024-11-06 Zhen Han, Zeyinzi Jiang, Yulin Pan, Jingfeng Zhang, Chaojie Mao, Chenwei Xie, Yu Liu, Jingren Zhou

End-to-end learning is emerging as a powerful paradigm for robotic manipulation, but its effectiveness is limited by data scarcity and the heterogeneity of action spaces across robot embodiments. In particular, diverse action spaces across…

Robotics · Computer Science 2026-03-23 Erik Bauer, Elvis Nava, Robert K. Katzschmann

In the field of Robot Learning, the complex mapping between high-dimensional observations such as RGB images and low-level robotic actions, two inherently very different spaces, constitutes a complex learning problem, especially with…

Robotics · Computer Science 2024-05-29 Vitalis Vosylius, Younggyo Seo, Jafar Uruç, Stephen James

Intelligent agents, such as robots and virtual agents, must understand the dynamics of complex social interactions to interact with humans. Effectively representing social dynamics is challenging because we require multi-modal, synchronized…

Machine Learning · Computer Science 2025-01-22 Antonio Lech Martin-Ozimek, Isuru Jayarathne, Su Larb Mon, Jouh Yeong Chew

Data imputation and data generation have important applications for many domains, like healthcare and finance, where incomplete or missing data can hinder accurate analysis and decision-making. Diffusion models have emerged as powerful…

Machine Learning · Computer Science 2025-06-10 Mario Villaizán-Vallelado, Matteo Salvatori, Carlos Segura, Ioannis Arapakis

In computer vision, it is well known that a lack of data diversity impairs model performance. In this study, we address the challenge of enhancing dataset diversity in order to benefit various downstream tasks such as…

Computer Vision and Pattern Recognition · Computer Science 2024-08-02 Yuhang Li, Xin Dong, Chen Chen, Weiming Zhuang, Lingjuan Lyu

Image clustering is a crucial but challenging task in multimedia machine learning. Recently the combination of clustering with deep learning has achieved promising performance against conventional methods on high-dimensional image data.…

Computer Vision and Pattern Recognition · Computer Science 2026-03-31 Ruilin Zhang, Haiyang Zheng, Hongpeng Wang

Embodied visual planning aims to enable manipulation tasks by imagining how a scene evolves toward a desired goal and using the imagined trajectories to guide actions. Video diffusion models, through their image-to-video generation…

Computer Vision and Pattern Recognition · Computer Science 2025-12-30 Yuming Gu, Yizhi Wang, Yining Hong, Yipeng Gao, Hao Jiang, Angtian Wang, Bo Liu, Nathaniel S. Dennler, Zhengfei Kuang, Hao Li, Gordon Wetzstein, Chongyang Ma