Related papers: Geometry-Aware Neural Rendering

ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks

Recently, the channel attention mechanism has been shown to offer great potential in improving the performance of deep convolutional neural networks (CNNs). However, most existing methods are dedicated to developing more sophisticated attention…

Computer Vision and Pattern Recognition · Computer Science · 2020-04-08 · Qilong Wang, Banggu Wu, Pengfei Zhu, Peihua Li, Wangmeng Zuo, Qinghua Hu

Geometry-Aware Recurrent Neural Networks for Active Visual Recognition

We present recurrent geometry-aware neural networks that integrate visual information across multiple views of a scene into 3D latent feature tensors, while maintaining a one-to-one mapping between 3D physical locations in the world scene…

Computer Vision and Pattern Recognition · Computer Science · 2018-11-15 · Ricson Cheng, Ziyan Wang, Katerina Fragkiadaki

Efficient Multi-Scale Attention Module with Cross-Spatial Learning

The remarkable effectiveness of channel or spatial attention mechanisms in producing more discernible feature representations has been illustrated in various computer vision tasks. However, modeling the cross-channel relationships with channel…

Computer Vision and Pattern Recognition · Computer Science · 2023-06-07 · Daliang Ouyang, Su He, Guozhong Zhang, Mingzhu Luo, Huaiyong Guo, Jian Zhan, Zhijie Huang

DEL: Discrete Element Learner for Learning 3D Particle Dynamics with Neural Rendering

Learning-based simulators show great potential for simulating particle dynamics when 3D groundtruth is available, but per-particle correspondences are not always accessible. The development of neural rendering presents a new solution to…

Computer Vision and Pattern Recognition · Computer Science · 2024-10-14 · Jiaxu Wang, Jingkai Sun, Junhao He, Ziyi Zhang, Qiang Zhang, Mingyuan Sun, Renjing Xu

Aether: Geometric-Aware Unified World Modeling

The integration of geometric reconstruction and generative modeling remains a critical challenge in developing AI systems capable of human-like spatial reasoning. This paper proposes Aether, a unified framework that enables geometry-aware…

Computer Vision and Pattern Recognition · Computer Science · 2025-07-29 · Aether Team, Haoyi Zhu, Yifan Wang, Jianjun Zhou, Wenzheng Chang, Yang Zhou, Zizun Li, Junyi Chen, Chunhua Shen, Jiangmiao Pang, Tong He

Learning to Reconstruct and Segment 3D Objects

To endow machines with the ability to perceive the real world in a three-dimensional representation, as humans do, is a fundamental and long-standing topic in Artificial Intelligence. Given different types of visual inputs such as…

Computer Vision and Pattern Recognition · Computer Science · 2020-10-20 · Bo Yang

Graph External Attention Enhanced Transformer

The Transformer architecture has recently gained considerable attention in the field of graph representation learning, as it naturally overcomes several limitations of Graph Neural Networks (GNNs) with customized attention mechanisms or…

Machine Learning · Computer Science · 2025-04-01 · Jianqing Liang, Min Chen, Jiye Liang

Equivariant Neural Rendering

We propose a framework for learning neural scene representations directly from images, without 3D supervision. Our key insight is that 3D structure can be imposed by ensuring that the learned representation transforms like a real 3D scene.…

Computer Vision and Pattern Recognition · Computer Science · 2020-12-22 · Emilien Dupont, Miguel Angel Bautista, Alex Colburn, Aditya Sankar, Carlos Guestrin, Josh Susskind, Qi Shan

FwNet-ECA: A Classification Model Enhancing Window Attention with Global Receptive Fields via Fourier Filtering Operations

Windowed attention mechanisms were introduced to mitigate the issue of excessive computation inherent in global attention mechanisms. In this paper, we present FwNet-ECA, a novel method that utilizes Fourier transforms paired with learnable…

Computer Vision and Pattern Recognition · Computer Science · 2025-03-05 · Shengtian Mian, Ya Wang, Nannan Gu, Yuping Wang, Xiaoqing Li

On Geometric Understanding and Learned Priors in Feed-forward 3D Reconstruction Models

Feed-forward 3D reconstruction models such as DUSt3R, VGGT, and Depth Anything 3 (DA3) are transformer-based foundation models that infer camera geometry and dense scene structure in a single forward pass. Trained at scale in a supervised…

Computer Vision and Pattern Recognition · Computer Science · 2026-03-18 · Jelena Bratulić, Sudhanshu Mittal, Thomas Brox, Christian Rupprecht

Learning Spatial Common Sense with Geometry-Aware Recurrent Networks

We integrate two powerful ideas, geometry and deep visual representation learning, into recurrent network architectures for mobile visual scene understanding. The proposed networks learn to "lift" and integrate 2D visual features over time…

Computer Vision and Pattern Recognition · Computer Science · 2019-04-10 · Hsiao-Yu Fish Tung, Ricson Cheng, Katerina Fragkiadaki

GTA: A Geometry-Aware Attention Mechanism for Multi-View Transformers

As transformers are equivariant to the permutation of input tokens, encoding the positional information of tokens is necessary for many tasks. However, since existing positional encoding schemes have been initially designed for NLP tasks,…

Computer Vision and Pattern Recognition · Computer Science · 2024-06-10 · Takeru Miyato, Bernhard Jaeger, Max Welling, Andreas Geiger

Curve Your Attention: Mixed-Curvature Transformers for Graph Representation Learning

Real-world graphs naturally exhibit hierarchical or cyclical structures that are unfit for the typical Euclidean space. While there exist graph neural networks that leverage hyperbolic or spherical spaces to learn representations that embed…

Machine Learning · Computer Science · 2023-09-11 · Sungjun Cho, Seunghyuk Cho, Sungwoo Park, Hankook Lee, Honglak Lee, Moontae Lee

ACORN: Adaptive Coordinate Networks for Neural Scene Representation

Neural representations have emerged as a new paradigm for applications in rendering, imaging, geometric modeling, and simulation. Compared to traditional representations such as meshes, point clouds, or volumes they can be flexibly…

Computer Vision and Pattern Recognition · Computer Science · 2021-05-07 · Julien N. P. Martel, David B. Lindell, Connor Z. Lin, Eric R. Chan, Marco Monteiro, Gordon Wetzstein

Recurrent 3D Attentional Networks for End-to-End Active Object Recognition

Active vision is inherently attention-driven: the agent actively selects views to attend to in order to quickly achieve the vision task while improving its internal representation of the scene being observed. Inspired by the recent success of…

Computer Vision and Pattern Recognition · Computer Science · 2022-01-12 · Min Liu, Yifei Shi, Lintao Zheng, Kai Xu, Hui Huang, Dinesh Manocha

A Comparative Evaluation of Geometric Accuracy in NeRF and Gaussian Splatting

Recent advances in neural rendering have introduced numerous 3D scene representations. Although standard computer vision metrics evaluate the visual quality of generated images, they often overlook the fidelity of surface geometry. This…

Computer Vision and Pattern Recognition · Computer Science · 2026-04-21 · Mikolaj Zielinski, Eryk Vykysaly, Bartlomiej Biesiada, Jan Baturo, Mateusz Capala, Dominik Belter

3D Scene Reconstruction with Multi-layer Depth and Epipolar Transformers

We tackle the problem of automatically reconstructing a complete 3D model of a scene from a single RGB image. This challenging task requires inferring the shape of both visible and occluded surfaces. Our approach utilizes viewer-centered,…

Computer Vision and Pattern Recognition · Computer Science · 2019-08-28 · Daeyun Shin, Zhile Ren, Erik B. Sudderth, Charless C. Fowlkes

SPREAD: Spatial-Physical REasoning via geometry Aware Diffusion

Automated 3D scene generation is pivotal for applications spanning virtual reality, digital content creation, and Embodied AI. While computer graphics prioritizes aesthetic layouts, vision and robotics demand scenes that mirror real-world…

Graphics · Computer Science · 2026-03-31 · Minzhang Li, Kuixiang Shao, Xuebing Li, Yuyang Jiao, Yinuo Bai, Hengan Zhou, Sixian Shen, Jiayuan Gu, Jingyi Yu

Correlational Neural Networks

Common Representation Learning (CRL), wherein different descriptions (or views) of the data are embedded in a common subspace, has received a lot of attention recently. Two popular paradigms here are Canonical Correlation Analysis (CCA)…

Computation and Language · Computer Science · 2015-10-13 · Sarath Chandar, Mitesh M. Khapra, Hugo Larochelle, Balaraman Ravindran

Equivariant Graph Attention Networks for Molecular Property Prediction

Learning and reasoning about 3D molecular structures with varying size is an emerging and important challenge in machine learning and especially in drug discovery. Equivariant Graph Neural Networks (GNNs) can simultaneously leverage the…

Machine Learning · Computer Science · 2022-03-03 · Tuan Le, Frank Noé, Djork-Arné Clevert