ELDEN: Exploration via Local Dependencies
Authors: Zizhao Wang, Jiaheng Hu, Peter Stone, Roberto Martín-Martín
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the performance of ELDEN on four different domains with complex dependencies, ranging from 2D grid worlds to 3D robotic tasks. In all domains, ELDEN correctly identifies local dependencies and learns successful policies, significantly outperforming previous state-of-the-art exploration methods. |
| Researcher Affiliation | Collaboration | Zizhao Wang (University of Texas at Austin, zizhao.wang@utexas.edu); Jiaheng Hu (University of Texas at Austin, jhu@cs.utexas.edu); Peter Stone (University of Texas at Austin and Sony AI, pstone@cs.utexas.edu); Roberto Martín-Martín (University of Texas at Austin, robertomm@cs.utexas.edu) |
| Pseudocode | Yes | Algorithm 1 Training of ELDEN (on-policy) |
| Open Source Code | No | The paper does not contain any explicit statement or link indicating that the source code for ELDEN is publicly available. |
| Open Datasets | Yes | We evaluate ELDEN in four simulated environments with different objects that have complex and chained dependencies: (1) CARWASH, (2) THAWING, (3) 2D MINECRAFT, and (4) KITCHEN. Both CARWASH and THAWING are long-horizon household tasks in a discrete gridworld from the Mini-BEHAVIOR Benchmark [12]. 2D MINECRAFT is an environment modified from the one used by Andreas et al. [1], where the agent needs to master a complex technology tree to finish the task. KITCHEN is a continuous robot table-top manipulation domain implemented in Robosuite [36]. |
| Dataset Splits | No | The paper mentions training dynamics models on 'pre-collected transition data' and evaluating them on 'unseen episodes', but it does not specify explicit training, validation, and test dataset splits with percentages or sample counts for the main RL task. |
| Hardware Specification | Yes | The experiments were conducted on machines of the following configurations: (1) Nvidia 2080 Ti GPU with AMD Ryzen Threadripper 3970X 32-Core Processor; (2) Nvidia A40 GPU with Intel(R) Xeon(R) Gold 6342 CPU @ 2.80GHz; (3) Nvidia A100 GPU with Intel(R) Xeon(R) Gold 6342 CPU @ 2.80GHz |
| Software Dependencies | No | The paper mentions software components such as PPO, the Adam optimizer, and specific environments (e.g., Mini-BEHAVIOR, Robosuite), but it does not specify version numbers for any of these software dependencies. |
| Experiment Setup | Yes | The hyperparameters used for evaluating local dependency detection of each method are provided in Table 3. Unless specified otherwise, the parameters are shared across all environments. During policy learning, all methods share the same PPO and training hyperparameters, provided in Table 4. |