Conditional Mutual Information for Disentangled Representations in Reinforcement Learning
Authors: Mhairi Dunion, Trevor McInroe, Kevin Luck, Josiah Hanna, Stefano Albrecht
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate experimentally, using continuous control tasks, that our approach improves generalisation under correlation shifts, as well as improving the training performance of RL algorithms in the presence of correlated features. |
| Researcher Affiliation | Academia | Mhairi Dunion (University of Edinburgh, mhairi.dunion@ed.ac.uk); Trevor McInroe (University of Edinburgh, t.mcinroe@ed.ac.uk); Kevin Sebastian Luck (Vrije Universiteit Amsterdam, k.s.luck@vu.nl); Josiah P. Hanna (University of Wisconsin-Madison, jphanna@cs.wisc.edu); Stefano V. Albrecht (University of Edinburgh, s.albrecht@ed.ac.uk) |
| Pseudocode | Yes | The architecture for CMID is shown in Figure 2, and the pseudocode is provided in Algorithm 1. (A hedged sketch of the adversarial objective follows the table.) |
| Open Source Code | Yes | A public and open-source implementation of CMID is available at github.com/uoe-agents/cmid. |
| Open Datasets | Yes | We evaluate our approach on continuous control tasks with image observations from the DeepMind Control Suite (DMC) (Tunyasuvunakool et al., 2020), where we add correlations between object colour and properties impacting dynamics (e.g. joint positions). (An environment-loading sketch follows the table.) |
| Dataset Splits | No | The paper mentions training and testing but does not explicitly describe a separate validation split for hyperparameter tuning. |
| Hardware Specification | Yes | For each experiment run we use a single NVIDIA Volta V100 GPU with 32GB memory and a single CPU. |
| Software Dependencies | No | The paper mentions PyTorch and the Captum library but does not provide version numbers for these or other software dependencies. (A version-recording snippet follows the table.) |
| Experiment Setup | Yes | Table 2 (Hyperparameter values for both SVEA and SVEA-CMID) lists detailed settings: replay buffer capacity 100000, batch size 128, discount factor 0.99, Adam optimizer, learning rate 1e-3 for the actor, critic and encoder, etc. (These values are collected in the config sketch after the table.) |
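
For the Pseudocode row, here is a minimal, hypothetical sketch of a discriminator-based adversarial conditional-independence penalty in the spirit of CMID. All class, function, and variable names are illustrative assumptions, not the authors' code; the actual procedure is Algorithm 1 in the paper and the uoe-agents/cmid repository.

```python
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    """Scores whether a (latent, conditioning) pair comes from the joint
    distribution (label 1) or a conditionally permuted one (label 0)."""
    def __init__(self, latent_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, z: torch.Tensor, cond: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([z, cond], dim=-1)).squeeze(-1)

@torch.no_grad()
def knn_conditional_permute(z: torch.Tensor, cond: torch.Tensor, k: int = 5) -> torch.Tensor:
    """Approximate conditionally independent samples by shuffling each latent
    dimension among the k nearest neighbours of the conditioning variable
    (assumes batch size > k)."""
    batch = z.shape[0]
    knn = torch.cdist(cond, cond).topk(k + 1, largest=False).indices[:, 1:]  # drop self-match
    z_perm = z.clone()
    for j in range(z.shape[1]):  # permute one latent dimension at a time
        pick = knn[torch.arange(batch), torch.randint(0, k, (batch,))]
        z_perm[:, j] = z[pick, j]
    return z_perm

def cmid_losses(z: torch.Tensor, cond: torch.Tensor, disc: Discriminator):
    """The discriminator maximises joint-vs-permuted classification accuracy;
    the encoder is trained adversarially against it, pushing latent features
    towards pairwise conditional independence given `cond`."""
    bce = nn.BCEWithLogitsLoss()
    z_perm = knn_conditional_permute(z, cond)
    logits_joint = disc(z.detach(), cond.detach())
    logits_perm = disc(z_perm, cond.detach())
    disc_loss = (bce(logits_joint, torch.ones_like(logits_joint))
                 + bce(logits_perm, torch.zeros_like(logits_perm)))
    logits_enc = disc(z, cond)
    enc_loss = bce(logits_enc, torch.zeros_like(logits_enc))  # fool the discriminator
    return disc_loss, enc_loss
```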
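
For the Open Datasets row, a minimal sketch of loading a DMC task with pixel observations via the dm_control package. The paper's added correlations between object colour and dynamics sit on top of such an environment and are not reproduced here; the choice of domain and task below is illustrative.

```python
from dm_control import suite

# Load an illustrative DMC continuous-control task.
env = suite.load(domain_name="cheetah", task_name="run")
timestep = env.reset()

# Take one placeholder step and render an image observation.
action = env.action_spec().generate_value()  # a valid in-bounds action
timestep = env.step(action)
frame = env.physics.render(height=84, width=84, camera_id=0)  # (84, 84, 3) RGB array
```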
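
For the Software Dependencies row, a trivial snippet for recording the versions the paper leaves unpinned; logging these alongside results would close the gap the row flags.

```python
import torch
import captum

print("torch", torch.__version__)
print("captum", captum.__version__)
```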
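
Finally, for the Experiment Setup row, the Table 2 values quoted above gathered into a plain config dict. The key names are illustrative; only the values come from the paper.

```python
# Table 2 hyperparameters for SVEA and SVEA-CMID (key names are assumptions).
HYPERPARAMS = {
    "replay_buffer_capacity": 100_000,
    "batch_size": 128,
    "discount": 0.99,
    "optimizer": "Adam",
    "lr_actor_critic_encoder": 1e-3,
}
```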