Measuring Goal-Directedness
Authors: Matt MacDermott, James Fox, Francesco Belardinelli, Tom Everitt
NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We prove that MEG satisfies several desiderata and demonstrate our algorithms with small-scale experiments. We carried out two experiments to measure known-utility MEG with respect to the environment reward function and unknown-utility MEG with respect to a hypothesis class of utility functions. |
| Researcher Affiliation | Collaboration | Matt MacDermott (Imperial College London); James Fox (University of Oxford, London Initiative for Safe AI); Francesco Belardinelli (Imperial College London); Tom Everitt (Google DeepMind) |
| Pseudocode | Yes | Algorithm 1 (Known-utility MEG in MDPs) and Algorithm 2 (Unknown-utility MEG in MDPs) |
| Open Source Code | Yes | Code available at https://github.com/mattmacdermott1/measuring-goal-directedness |
| Open Datasets | Yes | Our experiments measured MEG for various policies in the Cliff World environment from the seals suite [Gleave et al., 2020]. (See the environment-loading sketch below the table.) |
| Dataset Splits | No | The paper does not provide specific dataset split information for training, validation, or testing. |
| Hardware Specification | Yes | Hardware model: LENOVO 20N2000RUK; Processor: Intel(R) Core(TM) i7-8665U CPU @ 1.90 GHz, 2112 MHz, 4 core(s), 8 logical processor(s); Memory: 24.0 GB |
| Software Dependencies | No | The paper mentions using ‘SEALS library’ and ‘imitation library’ but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | We used an MLP with a single hidden layer of size 256 to define a utility function over states, and considered ε-greedy policies for ε in the range 0.1 to 0.9. (See the setup sketch below the table.) |
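
The Cliff World environment named in the Open Datasets row comes from the seals suite [Gleave et al., 2020]. Below is a minimal loading sketch, not taken from the paper's code: the environment id `seals/CliffWorld-v0` and the classic `gym` API are assumptions.

```python
# Minimal sketch (not from the paper's repository): loading the Cliff World
# environment from the seals suite. The environment id "seals/CliffWorld-v0"
# and the classic gym API are assumptions.
import gym
import seals  # importing seals registers the seals/* environment ids with gym

env = gym.make("seals/CliffWorld-v0")
obs = env.reset()
print(env.observation_space, env.action_space)
```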
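The Experiment Setup row describes a single-hidden-layer MLP utility function over states and a sweep over ε-greedy policies. The sketch below illustrates that setup under stated assumptions: only the hidden-layer size of 256 and the ε range 0.1 to 0.9 come from the paper; the use of PyTorch, the vector state encoding, and the helper names are hypothetical.

```python
# Sketch of the reported setup under stated assumptions. Only the hidden-layer
# size (256) and the epsilon range (0.1 to 0.9) come from the paper; PyTorch,
# the vector state input, and all names here are illustrative assumptions.
import numpy as np
import torch
import torch.nn as nn

class StateUtility(nn.Module):
    """MLP with one hidden layer of size 256 mapping a state vector to a scalar utility."""
    def __init__(self, state_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state).squeeze(-1)

def epsilon_greedy(q_values: np.ndarray, epsilon: float) -> np.ndarray:
    """Action distribution: greedy with probability 1 - epsilon, uniform otherwise."""
    n_actions = q_values.shape[-1]
    probs = np.full(n_actions, epsilon / n_actions)
    probs[np.argmax(q_values)] += 1.0 - epsilon
    return probs

# Policies evaluated for epsilon in the range 0.1 to 0.9.
epsilons = np.round(np.linspace(0.1, 0.9, 9), 1)
```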