VectorWorld: Efficient Streaming World Model via Diffusion Flow on Vector Graphs

Robotics 2026-03-19 v1 Computer Vision and Pattern Recognition

Abstract

Closed-loop evaluation of autonomous-driving policies requires interactive simulation beyond log replay. However, existing generative world models often degrade in closed loop due to (i) history-free initialization that mismatches policy inputs, (ii) multi-step sampling latency that violates real-time budgets, and (iii) compounding kinematic infeasibility over long horizons. We propose VectorWorld, a streaming world model that incrementally generates ego-centric $64\,\mathrm{m} \times 64\,\mathrm{m}$ lane-agent vector-graph tiles during rollout. VectorWorld aligns initialization with history-conditioned policies by producing a policy-compatible interaction state via a motion-aware gated VAE. It enables real-time outpainting via solver-free one-step masked completion with an edge-gated relational DiT trained with interval-conditioned MeanFlow and JVP-based large-step supervision. To stabilize long-horizon rollouts, we introduce $\Delta$Sim, a physics-aligned non-ego (NPC) policy with hybrid discrete-continuous actions and differentiable kinematic logit shaping. On Waymo Open Motion and nuPlan, VectorWorld improves map-structure fidelity and initialization validity, and supports stable, real-time $1\,\mathrm{km}+$ closed-loop rollouts (code: https://github.com/jiangchaokang/VectorWorld).
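
To make "solver-free one-step masked completion" concrete, here is a minimal sketch of the general idea, not the paper's implementation. It assumes a MeanFlow-style average-velocity function u(z, r, t) on the linear path z_t = (1 - t) x + t noise, so that a single evaluation at (r, t) = (0, 1) maps noise to a sample. The names `toy_mean_velocity` and `one_step_masked_completion` are hypothetical, the "network" is a closed-form toy for a point-mass target, and the way observed entries are clamped is an assumption.

```python
import numpy as np


def toy_mean_velocity(z, r, t, target):
    """Hypothetical stand-in for a learned average-velocity network u(z, r, t).

    Assumption: the data distribution is a point mass at `target` and the
    path is z_t = (1 - t) * x + t * noise. Trajectories are then straight
    lines, so the average velocity over any interval [r, t] is (z - target) / t.
    """
    return (z - target) / t


def one_step_masked_completion(known, mask, u_fn, rng):
    """Complete the entries where `mask` is False with one call to `u_fn`.

    known: array holding observed values where mask is True
    mask:  boolean array, True = observed, False = to be generated
    u_fn:  callable u(z, r, t) returning the average velocity
    """
    noise = rng.standard_normal(known.shape)
    # One-step rule: z_0 = z_1 - (1 - 0) * u(z_1, r=0, t=1); no ODE solver loop.
    sample = noise - u_fn(noise, 0.0, 1.0)
    # Keep observed entries untouched; fill only the masked-out ones.
    return np.where(mask, known, sample)


rng = np.random.default_rng(0)
target = np.arange(6.0).reshape(3, 2)  # toy "tile" with 3 elements, 2 features
mask = np.array([[True, True], [False, False], [True, False]])
known = np.where(mask, target, 0.0)

completed = one_step_masked_completion(
    known, mask, lambda z, r, t: toy_mean_velocity(z, r, t, target), rng
)
```

In this toy setting the single evaluation recovers the target exactly, because the average velocity is known in closed form. A trained network would only approximate it, and the real model additionally conditions on graph structure that this sketch omits.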

Cite

@article{arxiv.2603.17652,
  title  = {VectorWorld: Efficient Streaming World Model via Diffusion Flow on Vector Graphs},
  author = {Chaokang Jiang and Desen Zhou and Jiuming Liu and Kevin Li Sun},
  journal= {arXiv preprint arXiv:2603.17652},
  year   = {2026}
}

Comments

Under Review
