Maximum State Entropy Exploration using Predecessor and Successor Representations
Authors: Arnav Kumar Jain, Lucas Lehnert, Irina Rish, Glen Berseth
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 5 demonstrates through empirical experiments that ηψ-Learning achieves optimal coverage within a single finite-length trajectory; the visualizations in the same section show that the learned exploration policy maneuvers through the state space to explore a task efficiently while minimizing how often the same state is revisited. |
| Researcher Affiliation | Collaboration | Arnav Kumar Jain (Mila - Quebec AI Institute, Université de Montréal); Lucas Lehnert (Fundamental AI Research at Meta); Irina Rish (Mila - Quebec AI Institute, Université de Montréal); Glen Berseth (Mila - Quebec AI Institute, Université de Montréal) |
| Pseudocode | Yes | Algorithm 1, "ηψ-Learning: Dynamic Programming Framework" (a minimal illustrative sketch of the underlying idea appears after this table). |
| Open Source Code | Yes | An implementation of the ηψ-Learning algorithm together with instructions for reproducing the experiments presented in this paper can be found at https://github.com/arnavkj1995/Eta_Psi_Learning. |
| Open Datasets | Yes | The Chain MDP and RiverSwim [58] are six-state chains whose transitions are deterministic and stochastic, respectively (an illustrative environment sketch follows the table). |
| Dataset Splits | No | The paper does not provide explicit training/validation/test dataset splits. It reports training parameters such as 'Length of trajectory from environment' and 'Number of episodes', along with evaluation parameters for the metrics; as an online reinforcement-learning method, ηψ-Learning learns from environment interaction rather than a fixed dataset, and no formal splitting strategy is described. |
| Hardware Specification | Yes | All models were trained on a single NVIDIA V100 GPU with 32 GB memory. |
| Software Dependencies | No | The paper mentions software components such as the RLHive [46] library, Dreamer V2 [20], and the Adam [29] optimizer, but it does not give version numbers for these dependencies, which reproducibility requires (a sketch of how installed versions could be recorded appears after the table). |
| Experiment Setup | Yes | Tables 3 and 4 ('Hyper Parameters') in Appendix G explicitly list values for the experimental setup, including 'Batch Size', 'Sequence Length', 'α for γ-function', 'Encoder layers', 'Learning rate', and 'Optimizer', among others (an illustrative configuration sketch follows the table). |
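
The pseudocode row above points to Algorithm 1; the authors' dynamic-programming details are in the paper itself. The sketch below is an assumption-laden illustration of the high-level idea only, not the paper's Algorithm 1: it records past visitations of the current trajectory (an η-like term), estimates expected discounted future visitations with a successor representation (ψ), and greedily picks the action whose combined visitation distribution has the highest entropy. The chain dynamics, the repeat-one-action surrogate policy, and all names are illustrative choices.

```python
import numpy as np

n_states, n_actions, gamma, horizon = 6, 2, 0.95, 20

# Deterministic 6-state chain: action 0 moves left, action 1 moves right.
P = np.zeros((n_states, n_actions, n_states))
for s in range(n_states):
    P[s, 0, max(s - 1, 0)] = 1.0
    P[s, 1, min(s + 1, n_states - 1)] = 1.0

def successor_representation(policy, iters=200):
    """psi[s, s'] ~ expected discounted future visitations of s' from s."""
    psi = np.zeros((n_states, n_states))
    for _ in range(iters):
        new_psi = np.eye(n_states)
        for s in range(n_states):
            new_psi[s] += gamma * P[s, policy[s]] @ psi
        psi = new_psi
    return psi

def entropy(counts):
    p = counts / counts.sum()
    return -np.sum(p * np.log(p + 1e-12))

rng = np.random.default_rng(0)
state, past_counts = 0, np.zeros(n_states)
for t in range(horizon):
    past_counts[state] += 1.0  # eta-like record of the trajectory so far
    scores = []
    for a in range(n_actions):
        # Crude surrogate: evaluate action `a` as if repeated forever,
        # then score the entropy of past + predicted future visitations.
        psi = successor_representation(np.full(n_states, a))
        future = P[state, a] @ psi
        scores.append(entropy(past_counts + future))
    action = int(np.argmax(scores))
    state = rng.choice(n_states, p=P[state, action])

print("state visit counts:", past_counts)
```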
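
For the environments in the Open Datasets row, below is an illustrative Gym-style RiverSwim following the common formulation attributed to [58]: LEFT is deterministic while RIGHT is stochastic, so swimming upstream often fails. The transition probabilities (0.6 / 0.3 / 0.1 for RIGHT) are assumptions and may differ from the paper's exact configuration; making RIGHT deterministic recovers the Chain MDP.

```python
import numpy as np

class RiverSwim:
    """Six-state chain; LEFT (0) is deterministic, RIGHT (1) stochastic."""
    LEFT, RIGHT = 0, 1

    def __init__(self, n_states=6, seed=0):
        self.n = n_states
        self.rng = np.random.default_rng(seed)
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        s = self.state
        if action == self.LEFT:      # deterministic drift downstream
            self.state = max(s - 1, 0)
        else:                        # stochastic swim upstream (assumed probs)
            u = self.rng.random()
            if u < 0.6:
                self.state = min(s + 1, self.n - 1)
            elif u < 0.9:
                self.state = s
            else:
                self.state = max(s - 1, 0)
        return self.state

env = RiverSwim()
s = env.reset()
visits = np.zeros(env.n)
for _ in range(100):
    s = env.step(env.RIGHT)
    visits[s] += 1
print("state visits under always-RIGHT:", visits)
```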
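
Since the paper names RLHive [46], Dreamer V2 [20], and the Adam optimizer [29] without version numbers, a small helper like the one below could record the versions actually installed when an experiment runs. The distribution names `rlhive`, `torch`, and `numpy` are assumptions about the environment, not details confirmed by the paper.

```python
from importlib import metadata

def log_versions(packages):
    """Print pinned-style version lines for the given distributions."""
    for name in packages:
        try:
            print(f"{name}=={metadata.version(name)}")
        except metadata.PackageNotFoundError:
            print(f"{name}: not installed")

log_versions(["rlhive", "torch", "numpy"])
```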
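
Finally, the hyperparameters named in Tables 3 and 4 could be collected in a single configuration object, as sketched below. Every value here is a placeholder chosen for illustration; the paper's reported settings are in Appendix G.

```python
from dataclasses import dataclass

@dataclass
class EtaPsiConfig:
    batch_size: int = 32          # "Batch Size" (placeholder value)
    sequence_length: int = 16     # "Sequence Length" (placeholder value)
    gamma_alpha: float = 0.5      # "alpha for gamma-function" (placeholder)
    encoder_layers: int = 2       # "Encoder layers" (placeholder value)
    learning_rate: float = 1e-4   # "Learning rate" (placeholder value)
    optimizer: str = "adam"       # "Optimizer", Adam [29]

config = EtaPsiConfig()
print(config)
```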