Alexandre Alahi — Scifaro

Deformable Gaussian Occupancy: Decoupling Rigid and Nonrigid Motion with Factorized Distillation

Understanding dynamic 3D environments is essential for safe autonomous driving, particularly when reasoning about human-centric, nonrigid agents. However, existing weakly supervised occupancy prediction frameworks predominantly assume…

Computer Vision and Pattern Recognition · Computer Science · 2026-05-28 · Yang Gao, Wuyang Li, Po-Chien Luan, Alexandre Alahi

Proprio: Latent Self-Scoring and Inference-Time Refinement for Physically Plausible Video Generation

Modern video generative models produce visually impressive results, yet frequently violate basic physical principles. We propose Proprio, a training-free framework that enables a frozen video generator to assess and improve the physical…

Computer Vision and Pattern Recognition · Computer Science · 2026-05-28 · Mariam Hassan, Kaouther Messaoud, Wuyang Li, Alexandre Alahi

Drift-Resistant Navigation World Model with Anchored Epipolar Guidance

We propose Drift-Resistant Navigation World Model, a generative model that mitigates both perceptual drift and geometric drift in conventional rollout-based navigation world models. Existing methods recursively feed generated content into…

Computer Vision and Pattern Recognition · Computer Science · 2026-05-26 · Po-Chien Luan, Zimin Xia, Wuyang Li, Yang Gao, Alexandre Alahi

Social-Mamba: Socially-Aware Trajectory Forecasting with State-Space Models

Human trajectory forecasting is crucial for safe navigation in crowded environments, requiring models that balance accuracy with computational efficiency. Efficiently modeling social interactions is key to performance in dense crowds. Yet,…

Computer Vision and Pattern Recognition · Computer Science · 2026-05-18 · Po-Chien Luan, Wuyang Li, Yang Gao, Alexandre Alahi

EverAnimate: Minute-Scale Human Animation via Latent Flow Restoration

We propose EverAnimate, an efficient post-training method for long-horizon animated video generation that preserves visual quality and character identity. Long-form animation remains challenging because highly dynamic human motion must be…

Computer Vision and Pattern Recognition · Computer Science · 2026-05-15 · Wuyang Li, Yang Gao, Mariam Hassan, Lan Feng, Wentao Pan, Po-Chien Luan, Alexandre Alahi

Position: Mind the Gap - AI Security and the Limits of Current Reporting Standards

AI systems face a growing number of AI security threats that are increasingly exploited in the real world. Hence, shared AI incident reporting practices are emerging in industry as best practice and as mandated by regulatory requirements.…

Cryptography and Security · Computer Science · 2026-05-06 · Lukas Bieringer, Sean McGregor, Nicole Nichols, Kevin Paeth, Jochen Stängler, Andreas Wespi, Alexandre Alahi, Kathrin Grosse

Grounded World Model for Semantically Generalizable Planning

In Model Predictive Control (MPC), world models predict the future outcomes of various action proposals, which are then scored to guide the selection of the optimal action. For visuomotor MPC, the score function is a distance metric between…

Robotics · Computer Science · 2026-04-14 · Quanyi Li, Lan Feng, Haonan Zhang, Wuyang Li, Letian Wang, Alexandre Alahi, Harold Soh

PoseDriver: A Unified Approach to Multi-Category Skeleton Detection for Autonomous Driving

Object skeletons offer a concise representation of structural information, capturing essential aspects of posture and orientation that are crucial for autonomous driving applications. However, a unified architecture that simultaneously…

Computer Vision and Pattern Recognition · Computer Science · 2026-03-26 · Yasamin Borhani, Taylor Mordan, Yihan Wang, Reyhaneh Hosseininejad, Javad Khoramdel, Alexandre Alahi

Anchored Video Generation: Decoupling Scene Construction and Temporal Synthesis in Text-to-Video Diffusion Models

State-of-the-art Text-to-Video (T2V) diffusion models can generate visually impressive results, yet they still frequently fail to compose complex scenes or follow logical temporal instructions. In this paper, we argue that many errors,…

Computer Vision and Pattern Recognition · Computer Science · 2026-03-26 · Mariam Hassan, Bastien Van Delft, Wuyang Li, Alexandre Alahi

SenCache: Accelerating Diffusion Model Inference via Sensitivity-Aware Caching

Diffusion models achieve state-of-the-art video generation quality, but their inference remains expensive due to the large number of sequential denoising steps. This has motivated a growing line of research on accelerating diffusion…

Computer Vision and Pattern Recognition · Computer Science · 2026-03-02 · Yasaman Haghighi, Alexandre Alahi

Loc$^2$: Interpretable Cross-View Localization via Depth-Lifted Local Feature Matching

We propose an accurate and interpretable fine-grained cross-view localization method that estimates the 3 Degrees of Freedom (DoF) pose of a ground-level image by matching its local features with a reference aerial image. Unlike prior…

Computer Vision and Pattern Recognition · Computer Science · 2026-02-27 · Zimin Xia, Chenghao Xu, Alexandre Alahi

Communication-Inspired Tokenization for Structured Image Representations

Discrete image tokenizers have emerged as a key component of modern vision and multimodal systems, providing a sequential interface for transformer-based architectures. However, most existing approaches remain primarily optimized for…

Computer Vision and Pattern Recognition · Computer Science · 2026-02-25 · Aram Davtyan, Yusuf Sahin, Yasaman Haghighi, Sebastian Stapf, Pablo Acuaviva, Alexandre Alahi, Paolo Favaro

LayerSync: Self-aligning Intermediate Layers

We propose LayerSync, a domain-agnostic approach for improving the generation quality and the training efficiency of diffusion models. Prior studies have highlighted the connection between the quality of generation and the representations…

Computer Vision and Pattern Recognition · Computer Science · 2026-02-20 · Yasaman Haghighi, Bastien van Delft, Mariam Hassan, Alexandre Alahi

RAP: 3D Rasterization Augmented End-to-End Planning

Imitation learning for end-to-end driving trains policies only on expert demonstrations. Once deployed in a closed loop, such policies lack recovery data: small mistakes cannot be corrected and quickly compound into failures. A promising…

Computer Vision and Pattern Recognition · Computer Science · 2026-02-10 · Lan Feng, Yang Gao, Eloi Zablocki, Quanyi Li, Wuyang Li, Sichao Liu, Matthieu Cord, Alexandre Alahi

JointDiff: Bridging Continuous and Discrete in Multi-Agent Trajectory Generation

Generative models often treat continuous data and discrete events as separate processes, creating a gap in modeling complex systems where they interact synchronously. To bridge this gap, we introduce JointDiff, a novel diffusion framework…

Machine Learning · Computer Science · 2026-01-30 · Guillem Capellera, Luis Ferraz, Antonio Rubio, Alexandre Alahi, Antonio Agudo

COARSE: Collaborative Pseudo-Labeling with Coarse Real Labels for Off-Road Semantic Segmentation

Autonomous off-road navigation faces challenges due to diverse, unstructured environments, requiring robust perception with both geometric and semantic understanding. However, scarce densely labeled semantic data limits generalization…

Computer Vision and Pattern Recognition · Computer Science · 2026-01-27 · Aurelio Noca, Xianmei Lei, Jonathan Becktor, Jeffrey Edlund, Anna Sabel, Patrick Spieler, Curtis Padgett, Alexandre Alahi, Deegan Atha

MAD: Motion Appearance Decoupling for efficient Driving World Models

Recent video diffusion models generate photorealistic, temporally coherent videos, yet they fall short as reliable world models for autonomous driving, where structured motion and physically consistent interactions are essential. Adapting…

Computer Vision and Pattern Recognition · Computer Science · 2026-01-15 · Ahmad Rahimi, Valentin Gerard, Eloi Zablocki, Matthieu Cord, Alexandre Alahi

RUMPL: Ray-Based Transformers for Universal Multi-View 2D to 3D Human Pose Lifting

Estimating 3D human poses from 2D images remains challenging due to occlusions and projective ambiguity. Multi-view learning-based approaches mitigate these issues but often fail to generalize to real-world scenarios, as large-scale…

Computer Vision and Pattern Recognition · Computer Science · 2025-12-18 · Seyed Abolfazl Ghasemzadeh, Alexandre Alahi, Christophe De Vleeschouwer

EvoLM: In Search of Lost Language Model Training Dynamics

Modern language model (LM) training has been divided into multiple stages, making it difficult for downstream developers to evaluate the impact of design choices made at each stage. We present EvoLM, a model suite that enables systematic…

Computation and Language · Computer Science · 2025-11-19 · Zhenting Qi, Fan Nie, Alexandre Alahi, James Zou, Himabindu Lakkaraju, Yilun Du, Eric Xing, Sham Kakade, Hanlin Zhang

Strada-LLM: Graph LLM for traffic prediction

Traffic forecasting is pivotal for intelligent transportation systems, where accurate and interpretable predictions can significantly enhance operational efficiency and safety. A key challenge stems from the heterogeneity of traffic…

Machine Learning · Computer Science · 2025-11-17 · Seyed Mohamad Moghadas, Bruno Cornelis, Alexandre Alahi, Adrian Munteanu