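Safety-Constraint Alignment of Autonomous-Driving VLA Models via Regularized Multi-Preference Learning (original title: 正則化付き多重選好学習による自動運転 VLA モデルの安全制約アライメント)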
JSAI 2026 (Annual Conference of the Japanese Society for Artificial Intelligence), 2026
Abstract
Japanese-language short version of PL-DPO-NLL, the Plackett–Luce preference learning framework with NLL regularization for safety alignment of vision–language–action driving policies. Presented as poster 4Yin-A-08 at JSAI 2026 in Gunma, Japan.
Accepted at JSAI 2026, Gunma, Japan.