Related papers: Semantic Relational Object Tracking
Robotic agents should be able to learn from sub-symbolic sensor data, and at the same time, be able to reason about objects and communicate with humans on a symbolic level. This raises the question of how to overcome the gap between…
Reliable perception is essential for robots that interact with the world. But sensors alone are often insufficient to provide this capability, and they are prone to errors due to various conditions in the environment. Furthermore, there is…
Bridging the physical and digital world through interaction remains a core challenge in augmented reality (AR). Existing systems target single objects, limiting support for planning, comparison, and assembly tasks that depend on…
Object-centric world models provide structured representation of the scene and can be an important backbone in reinforcement learning and planning. However, existing approaches suffer in partially-observable environments due to the lack of…
Object rearrangement has recently emerged as a key competency in robot manipulation, with practical solutions generally involving object detection, recognition, grasping and high-level planning. Goal-images describing a desired scene…
A more realistic object detection paradigm, Open-World Object Detection, has arisen increasing research interests in the community recently. A qualified open-world object detector can not only identify objects of known categories, but also…
Reasoning about spatial relationships between objects is essential for many real-world robotic tasks, such as fetch-and-delivery, object rearrangement, and object search. The ability to detect and disambiguate different objects and identify…
To determine the 3D orientation and 3D location of objects in the surroundings of a camera mounted on a robot or mobile device, we developed two powerful algorithms in object detection and temporal tracking that are combined seamlessly for…
Tracking by detection, the dominant approach for online multi-object tracking, alternates between localization and association steps. As a result, it strongly depends on the quality of instantaneous observations, often failing when objects…
Event-based object detection has recently garnered attention in the computer vision community due to the exceptional properties of event cameras, such as high dynamic range and no motion blur. However, feature asynchronism and sparsity…
This paper addresses multi-object systems, where objects may occlude one another relative to the sensor. The standard point-object model for detection-based sensors is enhanced so that the probability of detection considers the presence of…
A natural way to improve the detection of objects is to consider the contextual constraints imposed by the detection of additional objects in a given scene. In this work, we exploit the spatial relations between objects in order to improve…
Symbolic anchoring is a crucial problem in the field of robotics, as it enables robots to obtain symbolic knowledge from the perceptual information acquired through their sensors. In cognitive-based robots, this process of processing…
Loop closure can effectively correct the accumulated error in robot localization, which plays a critical role in the long-term navigation of the robot. Traditional appearance-based methods rely on local features and are prone to failure in…
Intelligent robots require object-level scene understanding to reason about possible tasks and interactions with the environment. Moreover, many perception tasks such as scene reconstruction, image retrieval, or place recognition can…
To accomplish tasks in human-centric indoor environments, robots need to represent and understand the world in terms of objects and their attributes. We refer to this attribute-based representation as a world model, and consider how to…
The problem of searching for a model-based scene interpretation is analyzed within a probabilistic framework. Object models are formulated as generative models for range data of the scene. A new statistical criterion, the truncated object…
A robot's ability to understand or ground natural language instructions is fundamentally tied to its knowledge about the surrounding world. We present an approach to grounding natural language utterances in the context of factual…
Modeling instance-level context and object-object relationships is extremely challenging. It requires reasoning about bounding boxes of different classes, locations \etc. Above all, instance-level spatial reasoning inherently requires…
We present a filtering-based method for semantic mapping to simultaneously detect objects and localize their 6 degree-of-freedom pose. For our method, called Contextual Temporal Mapping (or CT-Map), we represent the semantic map as a belief…