March 2023 • Andreea Bobu, Yi Liu, Rohin Shah, Daniel S. Brown, Anca D. Dragan
When robots learn reward functions using high capacity models that take raw state directly as input, they need to both learn a representation for what matters in the task -- the task "features" -- as well as how to combine these features into a single objective. If they try to do both at once from input designed to teach the full reward function, it is easy to end up with a representation that contains spurious correlations in the data, which fails to generalize to new settings. Instead, our ultimate goal is…
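
To make the factorization the abstract describes concrete, below is a minimal sketch of a reward model split into a feature encoder phi(s) ("what matters in the task") and a combination step mapping features to a scalar objective ("how to combine these features"), written in PyTorch. All names and dimensions here (STATE_DIM, FEATURE_DIM, feature_encoder, reward_head) are illustrative assumptions, not the paper's actual code.

    import torch
    import torch.nn as nn

    STATE_DIM, FEATURE_DIM = 10, 4  # hypothetical sizes, not from the paper

    # phi: raw state -> task features ("what matters in the task")
    feature_encoder = nn.Sequential(
        nn.Linear(STATE_DIM, 64),
        nn.ReLU(),
        nn.Linear(64, FEATURE_DIM),
    )

    # w: features -> scalar objective ("how to combine these features")
    reward_head = nn.Linear(FEATURE_DIM, 1, bias=False)

    def reward(state: torch.Tensor) -> torch.Tensor:
        # Reward as a linear combination of learned features.
        return reward_head(feature_encoder(state))

    # Training phi and w jointly from reward-teaching input alone is the
    # setting the abstract warns about: phi can absorb spurious correlations
    # in the data that fail to generalize to new settings.
    print(reward(torch.randn(1, STATE_DIM)))

The point of the split is that the two parts can, in principle, be trained from different supervision: if phi is learned jointly with the combination weights from reward-teaching input alone, nothing prevents it from encoding incidental correlations rather than the features people actually care about.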