Value Alignment Verification
Authors: Daniel S Brown, Jordan Schneider, Anca Dragan, Scott Niekum
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We analyze verification of exact value alignment for rational agents and propose and analyze heuristic and approximate value alignment verification tests in a wide range of gridworlds and a continuous autonomous driving domain. Finally, in Section 4.5 we study the most general setting of implicit human, implicit robot. We propose an algorithm for approximate value alignment verification in continuous state and action spaces and provide empirical results in a continuous autonomous driving domain where the human can only query the robot for preferences over trajectories. We now study the empirical performance of value alignment verification tests, first in the explicit human setting and then in the implicit human setting. Figure 4b displays the results of the synthetic human experiments. |
| Researcher Affiliation | Academia | ¹University of California, Berkeley, USA; ²University of Texas at Austin, USA. |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Source code and videos are available at https://sites.google.com/view/icml-vav. |
| Open Datasets | Yes | We also analyze verification of exact value alignment for rational agents and propose and analyze heuristic and approximate value alignment verification tests in a wide range of gridworlds and a continuous autonomous driving domain. We next analyze approximate value alignment verification in the continuous autonomous driving domain from Sadigh et al. (2017). |
| Dataset Splits | No | The paper describes testing accuracy, but does not specify explicit train/validation/test splits with percentages or sample counts in the main text. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., library names with versions) needed to replicate the experiments. |
| Experiment Setup | No | The paper does not provide specific experimental setup details such as concrete hyperparameter values, training configurations, or system-level settings in the main text. It directs readers to the appendix ("See Appendix G for experimental parameters and details of the testing reward generation protocol"), but these details are not in the main paper. |