Position: Intent-aligned AI Systems Must Optimize for Agency Preservation
Authors: Catalin Mitelut, Benjamin Smith, Peter Vamplew
ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In Appendix E we provide simulations to show how elementary interactions with AI systems that do not penalize agency loss can result in decreasing agency or options of end users. ... We simulate an episode of 10,000 action selections by the human and compute the value that the observing AI agent would ascribe to each action at each time point using TD-learning (with a learning rate of 0.1; colored plot-lines in Fig 3). ... An average over ten independent episodes similarly yields an uneven recommendation distribution of 23%, 28%, 31% and 18% respectively for the actions. |
| Researcher Affiliation | Academia | ¹Forum Basiliense, University of Basel; ²University of Oregon; ³Federation University Australia. |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | Code will be provided following the blind-review process. |
| Open Datasets | No | The paper uses conceptual simulations to demonstrate arguments rather than publicly available datasets with concrete access information. |
| Dataset Splits | No | The paper uses conceptual simulations and describes 'episodes' but does not specify formal training, validation, or test splits of the kind reported in empirical ML papers for reproducibility. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running simulations or experiments, only mentioning conceptual models. |
| Software Dependencies | No | The paper does not provide specific software dependencies or version numbers. |
| Experiment Setup | Yes | We simulate an episode of 10,000 action selections by the human and compute the value that the observing AI agent would ascribe to each action at each time point using TD-learning (with a learning rate of 0.1; colored plot-lines in Fig 3). ... We simulated such a paradigm using a hard boundary for value depletion (e.g. preventing AI systems from nudging or decreasing the value of an option beyond a certain limit, here 0.9 × initial value) and show the somewhat trivial result that both action selection and valuation are better preserved in such scenarios (Fig 6b,c). A hedged simulation sketch follows the table. |
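
The simulation described in the Research Type and Experiment Setup rows can be sketched as follows. This is a minimal illustration, not the authors' code (which is withheld pending review): the episode length of 10,000 selections, the TD learning rate of 0.1, and the 0.9 × initial-value floor come from the quoted excerpts, while the four-action setup is inferred from the reported recommendation distribution, and the softmax choice rule and per-selection depletion are assumptions made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

N_ACTIONS = 4          # number of options (inferred from the four reported percentages)
N_STEPS = 10_000       # episode length of 10,000 action selections (per the excerpt)
ALPHA = 0.1            # TD learning rate quoted in the paper
DEPLETION_FLOOR = 0.9  # hard boundary: values may not drop below 0.9 x initial value


def run_episode(enforce_floor: bool) -> np.ndarray:
    """Simulate one episode of human action selections observed by an AI agent.

    The reward model and softmax choice rule below are illustrative assumptions;
    the excerpt only specifies the episode length, learning rate, and floor.
    """
    values = np.ones(N_ACTIONS)   # AI's initial value estimate for each action
    initial = values.copy()
    for _ in range(N_STEPS):
        # Human picks an action; here modeled as a softmax over current values (assumption).
        probs = np.exp(values) / np.exp(values).sum()
        action = rng.choice(N_ACTIONS, p=probs)
        # Selecting an option slightly depletes its value (assumption standing in
        # for the value-depletion dynamic the paper describes).
        reward = values[action] - 0.01
        # TD(0)-style update of the AI's value estimate for the chosen action.
        values[action] += ALPHA * (reward - values[action])
        if enforce_floor:
            # Hard boundary: never let a value fall below 0.9 x its initial value.
            values = np.maximum(values, DEPLETION_FLOOR * initial)
    return values


print("no floor  :", run_episode(enforce_floor=False).round(3))
print("with floor:", run_episode(enforce_floor=True).round(3))
```

Run as a plain script, the no-floor episode shows action values drifting steadily downward, while the floored variant keeps every value at or above 0.9 × its initial level, mirroring the qualitative contrast the paper attributes to Fig 6b,c.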