reproducibilityindex.ai

Value Alignment Verification

Authors: Daniel S Brown, Jordan Schneider, Anca Dragan, Scott Niekum

ICML 2021

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We analyze verification of exact value alignment for rational agents and propose and analyze heuristic and approximate value alignment verification tests in a wide range of gridworlds and a continuous autonomous driving domain. Finally, in Section 4.5 we study the most general setting of implicit human, implicit robot. We propose an algorithm for approximate value alignment verification in continuous state and action spaces and provide empirical results in a continuous autonomous driving domain where the human can only query the robot for preferences over trajectories. We now study the empirical performance of value alignment verification tests, first in the explicit human setting and then in the implicit human setting. Figure 4b displays the results of the synthetic human experiments.
Researcher Affiliation	Academia	(1) University of California, Berkeley, USA; (2) University of Texas at Austin, USA.
Pseudocode	No	The paper does not contain any clearly labeled pseudocode or algorithm blocks.
Open Source Code	Yes	Source code and videos are available at https://sites.google.com/view/icml-vav.
Open Datasets	Yes	We also analyze verification of exact value alignment for rational agents and propose and analyze heuristic and approximate value alignment verification tests in a wide range of gridworlds and a continuous autonomous driving domain. We next analyze approximate value alignment verification in the continuous autonomous driving domain from Sadigh et al. (2017).
Dataset Splits	No	The paper describes testing accuracy, but does not specify explicit train/validation/test splits with percentages or sample counts in the main text.
Hardware Specification	No	The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running its experiments.
Software Dependencies	No	The paper does not provide specific software dependencies with version numbers (e.g., library names with versions) needed to replicate the experiments.
Experiment Setup	No	The paper does not provide specific experimental setup details, such as concrete hyperparameter values, training configurations, or system-level settings, in the main text. It directs readers to the appendix ("See Appendix G for experimental parameters and details of the testing reward generation protocol"), but these details are not in the main paper.