ReconDreamer

ReconDreamer: Crafting World Models for Driving Scene Reconstruction via Online Restoration

Chaojun Ni*¹,²
Guosheng Zhao*¹,⁴
Xiaofeng Wang*¹,⁴
Zheng Zhu*¹✉
Wenkang Qin¹
Guan Huang¹

Chen Liu³
Yuyin Chen³
Yida Wang³
Xueyang Zhang³
Yifei Zhan³
Kun Zhan³
Peng Jia³
Xianpeng Lang³

Xingang Wang⁴
Wenjun Mei²✉

¹ GigaAI
² Peking University
³ Li Auto Inc.
⁴ CASIA

Dynamic driving scene reconstruction methods, such as DriveDreamer4D and Street Gaussians, encounter significant challenges when rendering larger maneuvers (e.g., multi-lane shifts). In contrast, the proposed ReconDreamer significantly improves rendering quality by incrementally integrating world model knowledge.

The overall framework of ReconDreamer. During training of the dynamic driving scene reconstruction model, we begin by rendering novel trajectory views. These rendered videos are then processed by the DriveRestorer to restore their quality. The restored videos, together with the original video, are used to optimize the reconstruction model. This iterative process continues until the reconstruction model converges.
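The loop described in the caption above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `online_restoration_loop`, the callables `render`, `restore`, and `optimize`, and the loss-based stopping rule (`tol`, `max_rounds`) are all hypothetical placeholders, since the page only states that the process repeats until the reconstruction model converges.

```python
from typing import Any, Callable, List, Sequence


def online_restoration_loop(
    model: Any,
    original_video: Any,
    novel_trajectories: Sequence[Any],
    render: Callable[[Any, Any], Any],            # hypothetical: (model, trajectory) -> rendered video
    restore: Callable[[Any], Any],                # hypothetical stand-in for DriveRestorer
    optimize: Callable[[Any, List[Any]], float],  # hypothetical: updates model, returns a loss
    max_rounds: int = 10,
    tol: float = 1e-3,
) -> int:
    """Run render -> restore -> optimize rounds and return how many rounds were run."""
    prev_loss = float("inf")
    for round_idx in range(1, max_rounds + 1):
        # 1. Render the current model along each novel trajectory (these views may be degraded).
        rendered = [render(model, trajectory) for trajectory in novel_trajectories]
        # 2. Restore the rendered videos (the DriveRestorer step in the caption).
        restored = [restore(video) for video in rendered]
        # 3. Optimize the model on the restored videos together with the original video.
        loss = optimize(model, restored + [original_video])
        # 4. Assumed convergence test: stop once the loss stops changing by more than tol.
        if abs(prev_loss - loss) < tol:
            return round_idx
        prev_loss = loss
    return max_rounds
```

Passing the three stages in as callables keeps the sketch independent of any particular reconstruction backbone or restoration model; only the order of the steps is taken from the caption.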

Abstract

Closed-loop simulation is crucial for end-to-end autonomous driving. Existing sensor simulation methods (e.g., NeRF and 3DGS) reconstruct driving scenes based on conditions that closely mirror training data distributions. However, these methods struggle to render novel trajectories, such as lane changes. Recent works have demonstrated that integrating world model knowledge alleviates these issues. Despite their efficiency, these approaches still have difficulty accurately representing more complex maneuvers, with multi-lane shifts being a notable example. Therefore, we introduce ReconDreamer, which enhances driving scene reconstruction through incremental integration of world model knowledge. Specifically, DriveRestorer is proposed to mitigate artifacts via online restoration. This is complemented by a progressive data update strategy designed to ensure high-quality rendering for more complex maneuvers. To the best of our knowledge, ReconDreamer is the first method to effectively render large maneuvers. Experimental results demonstrate that ReconDreamer outperforms Street Gaussians in NTA-IoU, NTL-IoU, and FID, with relative improvements of 24.87%, 6.72%, and 29.97%, respectively. Furthermore, ReconDreamer surpasses DriveDreamer4D with PVG during large maneuver rendering, as verified by a relative improvement of 195.87% in the NTA-IoU metric and a comprehensive user study.
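The abstract reports relative improvements but does not spell out the formula. A common convention, assumed here, is the change relative to the baseline score, with the sign flipped for metrics where lower is better (FID is conventionally lower-is-better). The helper `relative_improvement` and the numbers in the usage lines are hypothetical and are not values from the paper.

```python
def relative_improvement(baseline: float, ours: float, higher_is_better: bool = True) -> float:
    """Return the improvement of `ours` over `baseline` as a percentage of the baseline."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    # For lower-is-better metrics, a decrease counts as a positive improvement.
    delta = (ours - baseline) if higher_is_better else (baseline - ours)
    return 100.0 * delta / abs(baseline)


# Hypothetical scores, for illustration only.
iou_gain = relative_improvement(baseline=0.40, ours=0.50)                             # IoU-style metric
fid_gain = relative_improvement(baseline=100.0, ours=70.0, higher_is_better=False)   # FID-style metric
print(f"{iou_gain:.2f}% {fid_gain:.2f}%")
```

Under this convention, a figure above 100% (such as the 195.87% quoted for NTA-IoU) would mean the score more than doubled relative to the baseline.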

Comparisons

Lane Shift 3m

Lane Shift 6m

Lane Change

Original Trajectory

Novel Trajectory

Rendering from Any Novel Trajectory Perspective

Original Trajectory

Novel Trajectory

Comparison of the ground-truth (GT) video from the original trajectory with the videos rendered under new trajectories. The left column shows the GT video from the original trajectory, while the right column shows the videos rendered under new trajectories.

Visualizing Object and Lane Detection

Object Detection Comparison

Lane Detection Comparison

Degraded videos rendered under new trajectories and their restored videos from DriveRestorer

Comparisons of degraded videos rendered under new trajectories and their restored videos from DriveRestorer. The left column shows degraded videos rendered under new trajectories, while the right column shows their restored videos.