Related papers: Efficient 3D Object Reconstruction using Visual Tr…

3D Vision with Transformers: A Survey

The success of the transformer architecture in natural language processing has recently triggered attention in the computer vision field. The transformer has been used as a replacement for the widely used convolution operators, due to its…

Computer Vision and Pattern Recognition · Computer Science 2022-08-09 Jean Lahoud, Jiale Cao, Fahad Shahbaz Khan, Hisham Cholakkal, Rao Muhammad Anwer, Salman Khan, Ming-Hsuan Yang

Deep Models for Multi-View 3D Object Recognition: A Review

Human decision-making often relies on visual information from multiple perspectives or views. In contrast, machine learning-based object recognition utilizes information from a single image of the object. However, the information conveyed…

Computer Vision and Pattern Recognition · Computer Science 2025-10-01 Mona Alzahrani, Muhammad Usman, Salma Kammoun, Saeed Anwar, Tarek Helmy

Image-based 3D Object Reconstruction: State-of-the-Art and Trends in the Deep Learning Era

3D reconstruction is a longstanding ill-posed problem, which has been explored for decades by the computer vision, computer graphics, and machine learning communities. Since 2015, image-based 3D reconstruction using convolutional neural…

Computer Vision and Pattern Recognition · Computer Science 2019-11-28 Xian-Feng Han, Hamid Laga, Mohammed Bennamoun

3D-RETR: End-to-End Single and Multi-View 3D Reconstruction with Transformers

3D reconstruction aims to reconstruct 3D objects from 2D views. Previous works for 3D reconstruction mainly focus on feature matching between views or using CNNs as backbones. Recently, Transformers have been shown effective in multiple…

Computer Vision and Pattern Recognition · Computer Science 2021-11-17 Zai Shi, Zhao Meng, Yiran Xing, Yunpu Ma, Roger Wattenhofer

HORT: Monocular Hand-held Objects Reconstruction with Transformers

Reconstructing hand-held objects in 3D from monocular images remains a significant challenge in computer vision. Most existing approaches rely on implicit 3D representations, which produce overly smooth reconstructions and are…

Computer Vision and Pattern Recognition · Computer Science 2025-07-22 Zerui Chen, Rolandos Alexandros Potamias, Shizhe Chen, Cordelia Schmid

Sketch2CAD: 3D CAD Model Reconstruction from 2D Sketch using Visual Transformer

Current 3D reconstruction methods typically generate outputs in the form of voxels, point clouds, or meshes. However, each of these formats has inherent limitations, such as rough surfaces and distorted structures. Additionally, these data…

Computer Vision and Pattern Recognition · Computer Science 2025-02-21 Hong-Bin Yang

Efficient Algorithms for Convolutional Inverse Problems in Multidimensional Imaging

Multidimensional imaging, capturing image data in more than two dimensions, has been an emerging field with diverse applications. Due to the limitation of two-dimensional detectors in obtaining the high-dimensional image data, computational…

Image and Video Processing · Electrical Eng. & Systems 2020-06-16 Didem Dogan, Figen S. Oktem

Active Object Reconstruction Using a Guided View Planner

Inspired by the recent advance of image-based object reconstruction using deep learning, we present an active reconstruction model using a guided view planner. We aim to reconstruct a 3D model using images observed from a planned sequence…

Computer Vision and Pattern Recognition · Computer Science 2018-05-09 Xin Yang, Yuanbo Wang, Yaru Wang, Baocai Yin, Qiang Zhang, Xiaopeng Wei, Hongbo Fu

3 Dimensional Dense Reconstruction: A Review of Algorithms and Dataset

3D dense reconstruction refers to the process of obtaining the complete shape and texture features of 3D objects from 2D planar images. 3D reconstruction is an important and extensively studied problem, but it is far from being solved. This…

Computer Vision and Pattern Recognition · Computer Science 2023-04-20 Yangming Li

Evaluating Vision Transformer Methods for Deep Reinforcement Learning from Pixels

Vision Transformers (ViT) have recently demonstrated the significant potential of transformer architectures for computer vision. To what extent can image-based deep reinforcement learning also benefit from ViT architectures, as compared to…

Machine Learning · Computer Science 2022-05-17 Tianxin Tao, Daniele Reda, Michiel van de Panne

Making a Case for 3D Convolutions for Object Segmentation in Videos

The task of object segmentation in videos is usually accomplished by processing appearance and motion information separately using standard 2D convolutional networks, followed by a learned fusion of the two sources of information. On the…

Computer Vision and Pattern Recognition · Computer Science 2023-09-04 Sabarinath Mahadevan, Ali Athar, Aljoša Ošep, Sebastian Hennen, Laura Leal-Taixé, Bastian Leibe

Transformers in Unsupervised Structure-from-Motion

Transformers have revolutionized deep learning based computer vision with improved performance as well as robustness to natural corruptions and adversarial attacks. Transformers are used predominantly for 2D vision tasks, including image…

Computer Vision and Pattern Recognition · Computer Science 2023-12-19 Hemang Chawla, Arnav Varma, Elahe Arani, Bahram Zonooz

3D Scene Reconstruction with Multi-layer Depth and Epipolar Transformers

We tackle the problem of automatically reconstructing a complete 3D model of a scene from a single RGB image. This challenging task requires inferring the shape of both visible and occluded surfaces. Our approach utilizes viewer-centered,…

Computer Vision and Pattern Recognition · Computer Science 2019-08-28 Daeyun Shin, Zhile Ren, Erik B. Sudderth, Charless C. Fowlkes

What Do Single-view 3D Reconstruction Networks Learn?

Convolutional networks for single-view object reconstruction have shown impressive performance and have become a popular subject of research. All existing techniques are united by the idea of having an encoder-decoder network that performs…

Computer Vision and Pattern Recognition · Computer Science 2019-05-10 Maxim Tatarchenko, Stephan R. Richter, René Ranftl, Zhuwen Li, Vladlen Koltun, Thomas Brox

Learning Efficient Point Cloud Generation for Dense 3D Object Reconstruction

Conventional methods of 3D object generative modeling learn volumetric predictions using deep networks with 3D convolutional operations, which are direct analogies to classical 2D ones. However, these methods are computationally wasteful in…

Computer Vision and Pattern Recognition · Computer Science 2017-06-22 Chen-Hsuan Lin, Chen Kong, Simon Lucey

Next-best-view Regression using a 3D Convolutional Neural Network

Automated three-dimensional (3D) object reconstruction is the task of building a geometric representation of a physical object by means of sensing its surface. Even though new single view reconstruction techniques can predict the surface,…

Computer Vision and Pattern Recognition · Computer Science 2021-01-27 J. Irving Vasquez-Gomez, David Troncoso, Israel Becerra, Enrique Sucar, Rafael Murrieta-Cid

A Simple and Scalable Shape Representation for 3D Reconstruction

Deep learning applied to the reconstruction of 3D shapes has seen growing interest. A popular approach to 3D reconstruction and generation in recent years has been the CNN encoder-decoder model usually applied in voxel space. However, this…

Computer Vision and Pattern Recognition · Computer Science 2020-05-12 Mateusz Michalkiewicz, Eugene Belilovsky, Mahsa Baktashmotlagh, Anders Eriksson

Research on Image Super-Resolution Reconstruction Mechanism based on Convolutional Neural Network

Super-resolution reconstruction techniques entail the utilization of software algorithms to transform one or more sets of low-resolution images captured from the same scene into high-resolution images. In recent years, considerable…

Computer Vision and Pattern Recognition · Computer Science 2024-08-02 Hao Yan, Zixiang Wang, Zhengjia Xu, Zhuoyue Wang, Zhizhong Wu, Ranran Lyu

Image Reconstruction as a Tool for Feature Analysis

Vision encoders are increasingly used in modern applications, from vision-only models to multimodal systems such as vision-language models. Despite their remarkable success, it remains unclear how these architectures represent features…

Computer Vision and Pattern Recognition · Computer Science 2025-06-10 Eduard Allakhverdov, Dmitrii Tarasov, Elizaveta Goncharova, Andrey Kuznetsov

Deep Single-View 3D Object Reconstruction with Visual Hull Embedding

3D object reconstruction is a fundamental task of many robotics and AI problems. With the aid of deep convolutional neural networks (CNNs), 3D object reconstruction has witnessed a significant progress in recent years. However, possibly due…

Computer Vision and Pattern Recognition · Computer Science 2018-09-11 Hanqing Wang, Jiaolong Yang, Wei Liang, Xin Tong