PrefDrive: Enhancing Autonomous Driving through Preference-Guided Large Language Models
IEEE Intelligent Vehicles Symposium (IV), 2025
Abstract
This work presents the first integration of Direct Preference Optimization (DPO) into LLM-based autonomous driving. The authors collect a novel dataset of 74,040 sequences annotated with driving preferences and perform memory-efficient fine-tuning (LoRA with 4-bit quantization) on a single RTX 3090 Ti. In CARLA closed-loop evaluation, PrefDrive reduces traffic light violations by 28.1%, improves route completion by 8.5%, and reduces layout collisions by 63.5%.
Accepted and presented in Cluj-Napoca, Romania.
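
For readers unfamiliar with the DPO objective named in the abstract, the following is a minimal sketch of the standard per-pair DPO loss. It is not taken from the PrefDrive code. The function name `dpo_loss`, the default `beta=0.1`, and the log-probability values in the usage lines are illustrative assumptions; the abstract does not state the hyperparameters or the implementation used.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) preference pair.

    Inputs are sequence log-probabilities under the policy being trained
    and under a frozen reference model. beta=0.1 is an assumed default,
    not a value reported in the abstract.
    """
    # Log-ratio of policy to reference for each response.
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_ratio - rejected_ratio)
    # -log(sigmoid(margin)) written as a numerically stable softplus(-margin).
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Hypothetical log-probabilities for one driving-decision pair.
baseline = dpo_loss(-5.0, -7.0, -5.0, -7.0)   # policy identical to reference
improved = dpo_loss(-4.0, -8.0, -5.0, -7.0)   # policy favors the chosen response
print(baseline, improved)
```

When the policy matches the reference, the margin is zero and the loss equals log 2. The loss falls as the policy assigns relatively more probability to the preferred response than the reference does, which is the signal a preference-annotated dataset provides.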