Adaptive Interest for Emphatic Reinforcement Learning

Authors: Martin Klissarov, Rasool Fakoor, Jonas W. Mueller, Kavosh Asadi, Taesup Kim, Alexander J. Smola

NeurIPS 2022

Reproducibility assessment (variable, result, LLM response):

- Research Type: Experimental. "Empirical evaluations on a wide range of environments show that adapting the interest is key to provide significant gains. Qualitative analysis indicates that the learned interest function emphasizes states of particular importance, such as bottlenecks, which can be especially useful in a transfer learning setting."
- Researcher Affiliation: Collaboration. Martin Klissarov (Mila, McGill University); Rasool Fakoor (Amazon Web Services); Jonas Mueller (Cleanlab); Kavosh Asadi (Amazon Web Services); Taesup Kim (Seoul National University); Alexander J. Smola (Amazon Web Services).
- Pseudocode: Yes. "Pseudocode for our approach, which we call MINT (Meta-gradient Interest), is presented in Algorithm 1 of App. A."
- Open Source Code: No. The paper does not explicitly state that the source code for the described methodology is released, nor does it link to a code repository.
- Open Datasets: Yes. "We adopt the setup from [16], who considered two variations of the classical Four Rooms domain... We verify the generality of the proposed method by considering the MinAtar domain [56]... We perform experiments on the MuJoCo domain [49, 7]."
- Dataset Splits: No. The paper does not give training/validation/test dataset splits (e.g., percentages or counts). It mentions random seeds and timesteps, which are typical for RL training, but no dataset splits in the supervised-learning sense.
- Hardware Specification: No. The paper does not specify the hardware (e.g., CPU or GPU models, memory) used for the experiments; the acknowledgments mention "GPU credit" but name no specific models.
- Software Dependencies: No. The paper names software components such as OpenAI Gym, MuJoCo, MinAtar, and a PPO agent, but provides no version numbers for any of them.
- Experiment Setup: Yes. "All hyperparameters are available in the App. E."