Rectifying Shortcut Behaviors in Preference-Based Reward Learning (arxiv.org) 1 point by PaulHoule 3 days ago