Abstract

This is the Japanese-language short version of PL-DPO-NLL, the Plackett–Luce preference learning framework with negative log-likelihood (NLL) regularization for the safety alignment of vision–language–action driving policies. It was presented as poster 4Yin-A-08 at JSAI 2026 in Gunma, Japan.

Accepted at JSAI 2026, Gunma, Japan.