Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Should Robots be Obedient?
Authors: Smitha Milli, Dylan Hadfield-Menell, Anca Dragan, Stuart Russell
IJCAI 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | “Figure 2: Autonomy advantage (left) and obedience O (right) over time.” and “All experiments in this paper use the following parameters unless otherwise noted. At the start of each episode θ ∼ N(0, I) and at each step φ_n(a) ∼ N(0, I). There are 10 actions, 10 features, and β = 2.” and “Finally, even with good approximations we may still have good reason for feeling hesitation about disobedient robots.” |
| Researcher Affiliation | Academia | Smitha Milli, Dylan Hadfield-Menell, Anca Dragan, Stuart Russell; University of California, Berkeley; EMAIL |
| Pseudocode | No | No pseudocode or algorithm block is present in the paper. |
| Open Source Code | Yes | All experiments can be replicated using the Jupyter notebook available at http://github.com/smilli/obedience |
| Open Datasets | No | All experiments in this paper use the following parameters unless otherwise noted. At the start of each episode θ ∼ N(0, I) and at each step φ_n(a) ∼ N(0, I). There are 10 actions, 10 features, and β = 2. |
| Dataset Splits | No | The paper uses a “simpler repeated game” where “each state is independent of the next”, but no explicit training/validation/test splits, percentages, or sample counts are mentioned. |
| Hardware Specification | No | No specific hardware details (such as GPU or CPU models, memory, or cloud instances) are mentioned for the experimental setup. |
| Software Dependencies | No | The paper mentions a 'Jupyter notebook' but does not provide specific version numbers for software dependencies such as programming languages, libraries, or frameworks (e.g., Python version, PyTorch version). |
| Experiment Setup | Yes | All experiments in this paper use the following parameters unless otherwise noted. At the start of each episode θ ∼ N(0, I) and at each step φ_n(a) ∼ N(0, I). There are 10 actions, 10 features, and β = 2. |
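
The experiment setup quoted in the table can be illustrated with a short sampling sketch. It is not the authors' notebook code. Only the distributions, the counts (10 actions, 10 features), and β = 2 come from the quoted excerpt. The linear reward θ · φ(a), the use of β as a Boltzmann temperature, and all function names are assumptions made for illustration.

```python
import math
import random

N_ACTIONS = 10   # from the quoted setup
N_FEATURES = 10  # from the quoted setup
BETA = 2.0       # from the quoted setup; its role below is an assumption


def sample_standard_normal_vector(rng, dim):
    """Draw a vector of i.i.d. N(0, 1) entries, i.e. one sample from N(0, I)."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def sample_episode(rng, n_steps):
    """Sample theta once per episode and fresh features phi_n(a) at each step."""
    theta = sample_standard_normal_vector(rng, N_FEATURES)
    steps = [
        [sample_standard_normal_vector(rng, N_FEATURES) for _ in range(N_ACTIONS)]
        for _ in range(n_steps)
    ]
    return theta, steps


def action_rewards(theta, features):
    """Assumed linear reward: the dot product theta . phi(a) for each action a."""
    return [sum(t * f for t, f in zip(theta, phi)) for phi in features]


def boltzmann_probs(rewards, beta):
    """Assumed noisy-choice model: P(a) proportional to exp(reward(a) / beta)."""
    scaled = [r / beta for r in rewards]
    top = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
theta, steps = sample_episode(rng, n_steps=3)
probs = boltzmann_probs(action_rewards(theta, steps[0]), BETA)
```

Here `probs` holds one choice distribution over the 10 actions for the first step. Under the assumed model, a larger β flattens the distribution toward uniform and a smaller β concentrates it on the highest-reward action.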