Related papers: IM2CAD
We address the task of aligning CAD models to a video sequence of a complex scene containing multiple objects. Our method can process arbitrary videos and fully automatically recover the 9 DoF pose for each object appearing in it, thus…
By moving a depth sensor around a room, we compute a 3D CAD model of the environment, capturing the room shape and contents such as chairs, desks, sofas, and tables. Rather than reconstructing geometry, we match, place, and align each…
We present a novel approach to reconstructing lightweight, CAD-based representations of scanned 3D environments from commodity RGB-D sensors. Our key idea is to jointly optimize for both CAD model alignments as well as layout estimations of…
Dense indoor scene modeling from 2D images has been bottlenecked due to the absence of depth information and cluttered occlusions. We present an automatic indoor scene modeling approach using deep features from neural networks. Given a…
We present Scan2CAD, a novel data-driven method that learns to align clean 3D CAD models from a shape database to the noisy and incomplete geometry of a commodity RGB-D scan. For a 3D reconstruction of an indoor scene, our method takes as…
Creating realistic styled spaces is a complex task, which involves design know-how for what furniture pieces go well together. Interior style follows abstract rules involving color, geometry and other visual elements. Following such rules,…
This paper presents an algorithm for indoor layout estimation and reconstruction through the fusion of a sequence of captured images and LiDAR data sets. In the proposed system, a movable platform collects both intensity images and 2D LiDAR…
Furnishing and rendering indoor scenes has been a long-standing task for interior design, where artists create a conceptual design for the space, build a 3D model of the space, decorate, and then perform rendering. Although the task is…
With the explosive growth of web-based cameras and mobile devices, billions of photographs are uploaded to the internet. We can trivially collect a huge number of photo streams for various goals, such as 3D scene reconstruction and other…
With the explosive growth of web-based cameras and mobile devices, billions of photographs are uploaded to the internet. We can trivially collect a huge number of photo streams for various goals, such as image clustering, 3D scene…
One major goal of vision is to infer physical models of objects, surfaces, and their layout from sensors. In this paper, we aim to interpret indoor scenes from one RGBD image. Our representation encodes the layout of orthogonal walls and…
The goal of this paper is to take a single 2D image of a scene and recover the 3D structure in terms of a small set of factors: a layout representing the enclosing surfaces as well as a set of objects represented in terms of shape and pose.…
Semantic reconstruction of indoor scenes refers to both scene understanding and object reconstruction. Existing works either address one part of this problem or focus on independent objects. In this paper, we bridge the gap between…
Discovering 3D arrangements of objects from single indoor images is important given its many applications including interior design, content creation, etc. Although heavily researched in the recent years, existing approaches break down…
Object recognition has seen significant progress in the image domain, with focus primarily on 2D perception. We propose to leverage existing large-scale datasets of 3D models to understand the underlying 3D structure of objects seen in an…
One major goal of vision is to infer physical models of objects, surfaces, and their layout from sensors. In this paper, we aim to interpret indoor scenes from one RGBD image. Our representation encodes the layout of walls, which must…
We address the problem of detecting and erasing furniture from a wide angle photograph of a room. Inpainting large regions of an indoor scene often results in geometric inconsistencies of background elements within the inpaint mask. To…
We propose a novel method to accurately reconstruct a set of images representing a single scene from few linear multi-view measurements. Each observed image is modeled as the sum of a background image and a foreground one. The background…
Image generation models trained on large datasets can synthesize high-quality images but often produce spatially inconsistent and distorted images due to limited information about the underlying structures and spatial layouts. In this work,…
Panorama images have a much larger field-of-view thus naturally encode enriched scene context information compared to standard perspective images, which however is not well exploited in the previous scene understanding methods. In this…