Interpretable Reward Redistribution in Reinforcement Learning: A Causal Approach
Authors: Yudi Zhang, Yali Du, Biwei Huang, Ziyan Wang, Jun Wang, Meng Fang, Mykola Pechenizkiy
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results show that our method outperforms state-of-the-art methods and the provided visualization further demonstrates the interpretability of our method. |
| Researcher Affiliation | Academia | 1 Eindhoven University of Technology, 2 King's College London, 3 University of California San Diego, 4 University College London, 5 University of Liverpool |
| Pseudocode | Yes | Algorithm 1 Learning the generative process and policy jointly. |
| Open Source Code | No | The project page is located at https://reedzyd.github.io/GenerativeReturnDecomposition/. The paper provides a link to a project page, not directly to a code repository. Although the project page may link to code, the direct link provided in the paper does not meet the criteria for a specific code repository. |
| Open Datasets | Yes | We evaluate our method on eight widely used classical robot control tasks in the MuJoCo environment [55], including Half-Cheetah, Ant, Walker2d, Humanoid, Swimmer, Hopper, Humanoid Standup, and Reacher tasks. |
| Dataset Splits | Yes | Validation is performed after every cycle, and the average metric is computed based on 10 test rollouts. |
| Hardware Specification | Yes | All experiments were conducted on an HPC system equipped with 128 Intel Xeon processors operating at a clock speed of 2.2 GHz and 5 terabytes of memory. |
| Software Dependencies | No | The information is insufficient. The paper mentions the Adam optimizer but does not specify version numbers for any software dependencies like programming languages, frameworks (e.g., PyTorch, TensorFlow), or libraries. |
| Experiment Setup | Yes | Table 3: The hyper-parameters used in the experiments for GRD. Table 4: The hyper-parameters. |