Related papers: Action sequencing using visual permutations

200 papers

Humans are able to seamlessly visually imitate others, by inferring their intentions and using past experience to achieve the same end goal. In other words, we can parse complex semantic knowledge from raw video and efficiently translate…

Machine Learning · Computer Science 2020-11-12 Sudeep Dasari, Abhinav Gupta

Humans are excellent at understanding language and vision to accomplish a wide range of tasks. In contrast, creating general instruction-following embodied agents remains a difficult challenge. Prior work that uses pure language-only models…

Computer Vision and Pattern Recognition · Computer Science 2023-03-28 Hao Liu, Lisa Lee, Kimin Lee, Pieter Abbeel

Recognizing the actions of others from visual stimuli is a crucial aspect of human visual perception that allows individuals to respond to social cues. Humans are able to identify similar behaviors and discriminate between distinct actions…

Neurons and Cognition · Quantitative Biology 2018-02-07 Andrea Tacchetti, Leyla Isik, Tomaso Poggio

Prospection, the act of predicting the consequences of many possible futures, is intrinsic to human planning and action, and may even be at the root of consciousness. Surprisingly, this idea has been explored comparatively little in…

Robotics · Computer Science 2018-04-03 Chris Paxton, Yotam Barnoy, Kapil Katyal, Raman Arora, Gregory D. Hager

Recent literature in the robotics community has focused on learning robot behaviors that abstract out lower-level details of robot control. To fully leverage the efficacy of such behaviors, it is necessary to select and sequence them to…

Agents capable of reasoning and planning in the real world require the ability to predict the consequences of their actions. While world models possess this capability, they most often require action labels, which can be complex to obtain…

Artificial Intelligence · Computer Science 2026-01-21 Quentin Garrido, Tushar Nagarajan, Basile Terver, Nicolas Ballas, Yann LeCun, Michael Rabbat

The field of visual representation learning has seen explosive growth in the past years, but its benefits in robotics have been surprisingly limited so far. Prior work uses generic visual representations as a basis to learn (task-specific)…

Robotics · Computer Science 2023-08-16 Jianren Wang, Sudeep Dasari, Mohan Kumar Srirama, Shubham Tulsiani, Abhinav Gupta

Over the last years, state-tracking tasks, particularly permutation composition, have become a testbed to understand the limits of sequence model architectures like Transformers and RNNs (linear and non-linear). However, these are often…

Machine Learning · Computer Science 2026-04-24 Julien Siems, Riccardo Grazzi, Kirill Kalinin, Hitesh Ballani, Babak Rahmani

This work aims to learn how to perform complex robot manipulation tasks that are composed of several, consecutively executed low-level sub-tasks, given as input a few visual demonstrations of the tasks performed by a person. The sub-tasks…

Robotics · Computer Science 2022-03-09 Junchi Liang, Bowen Wen, Kostas Bekris, Abdeslam Boularias

Neural networks have achieved success in a wide array of perceptual tasks but often fail at tasks involving both perception and higher-level reasoning. On these more challenging tasks, bespoke approaches (such as modular symbolic…

Computer Vision and Pattern Recognition · Computer Science 2021-10-27 David Ding, Felix Hill, Adam Santoro, Malcolm Reynolds, Matt Botvinick

Visual perception and language understanding are fundamental components of human intelligence, enabling them to understand and reason about objects and their interactions. It is crucial for machines to have this capacity to reason using…

Computer Vision and Pattern Recognition · Computer Science 2022-09-27 Thao Minh Le

Action recognition and anticipation are key to the success of many computer vision applications. Existing methods can roughly be grouped into those that extract global, context-aware representations of the entire image or sequence, and…

Computer Vision and Pattern Recognition · Computer Science 2016-11-21 Mohammad Sadegh Aliakbarian, Fatemehsadat Saleh, Basura Fernando, Mathieu Salzmann, Lars Petersson, Lars Andersson

Continual learning aims to train a model incrementally on a sequence of tasks without forgetting previous knowledge. Although continual learning has been widely studied in computer vision, its application to Vision+Language tasks is not…

Machine Learning · Computer Science 2024-01-23 Mavina Nikandrou, Lu Yu, Alessandro Suglia, Ioannis Konstas, Verena Rieser

Recently developed pretrained models can encode rich world knowledge expressed in multiple modalities, such as text and images. However, the outputs of these models cannot be integrated into algorithms to solve sequential decision-making…

Artificial Intelligence · Computer Science 2024-06-19 Yunhao Yang, Cyrus Neary, Ufuk Topcu

The ability to sequence unordered events is an essential skill to comprehend and reason about real world task procedures, which often requires thorough understanding of temporal common sense and multimodal information, as these procedures…

Computation and Language · Computer Science 2024-02-22 Te-Lin Wu, Alex Spangher, Pegah Alipoormolabashi, Marjorie Freedman, Ralph Weischedel, Nanyun Peng

A key challenge in scaling up robot learning to many skills and environments is removing the need for human supervision, so that robots can collect their own data and improve their own performance without being limited by the cost of…

Machine Learning · Computer Science 2017-03-14 Chelsea Finn, Sergey Levine

Machine learning models of visual action recognition are typically trained and tested on data from specific situations where actions are associated with certain objects. It is an open question how action-object associations in the training…

Computer Vision and Pattern Recognition · Computer Science 2022-08-16 Satoshi Tsutsui, Xizi Wang, Guangyuan Weng, Yayun Zhang, David Crandall, Chen Yu

In complex systems, we often observe complex global behavior emerge from a collection of agents interacting with each other in their environment, with each individual agent acting only on locally available information, without knowing the…

Neural and Evolutionary Computing · Computer Science 2021-09-30 Yujin Tang, David Ha

Various human activities can be abstracted into a sequence of actions in natural text, i.e. cooking, repairing, manufacturing, etc. Such action sequences heavily depend on the executing order, while disorder in action sequences leads to…

Computation and Language · Computer Science 2023-06-08 Weizhi Wang, Hong Wang, Xifeng Yan

In vision-based action recognition, spatio-temporal features from different modalities are used for recognizing activities. Temporal modeling is a long-standing challenge of action recognition. However, there are limited methods such as pre-computed…

Computer Vision and Pattern Recognition · Computer Science 2023-02-06 Elham Shabaninia, Hossein Nezamabadi-pour, Fatemeh Shafizadegan