RLFTSim: Realistic and Controllable Multi-Agent Traffic Simulation via Reinforcement Learning Fine-Tuning

March 29, 2026

Ehsan Ahmadi

Hunter Schofield

Behzad Khamidehi

Fazel Arasteh

Jinjun Shan

Lili Mou

Dongfeng Bai

Kasra Rezaee

Project Page

Abstract

Supervised open-loop training is widely used to train traffic simulation models; however, it fails to capture the inherently dynamic, multi-agent interactions prevalent in complex driving scenarios. We introduce RLFTSim, a reinforcement-learning-based fine-tuning framework that enhances scenario realism by aligning simulator rollouts with real-world data distributions and provides a method for distilling goal-conditioned controllability in scenario generation. We instantiate RLFTSim atop a pre-trained simulator, design a reward that balances fidelity and controllability, and perform extensive experiments on the Waymo Open Motion Dataset. Our results show improved realism and state-of-the-art performance. Compared with heuristic search-based fine-tuning methods, RLFTSim requires significantly fewer samples, owing to its low-variance, dense reward signal. We also demonstrate the effectiveness of our approach for enhancing controllability in traffic simulation via goal-conditioning.

Type

Conference paper

Publication

Accepted at the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2026)

Last updated on March 29, 2026