Maximum Entropy RL (Provably) Solves Some Robust RL Problems
Authors: Benjamin Eysenbach, Sergey Levine
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experimental evaluation shows that, in line with our theoretical findings, simple Max Ent RL algorithms perform competitively with (and sometimes better than) recently proposed adversarial robust RL methods on benchmarks proposed by those works. |
| Researcher Affiliation | Collaboration | Benjamin Eysenbach (Carnegie Mellon University, Google Brain) beysenba@cs.cmu.edu; Sergey Levine (UC Berkeley, Google Brain) |
| Pseudocode | No | The paper contains mathematical derivations and proofs but does not include any sections or figures explicitly labeled as 'Pseudocode' or 'Algorithm' with structured steps. |
| Open Source Code | No | The paper mentions using and modifying existing open-source implementations (e.g., 'SAC implementation from TF Agents', 'modifying the open source code released by Tessler et al. (2019)') but does not state that the authors' own code for the methodology described in the paper is open-source or available. |
| Open Datasets | Yes | We used the standard Pusher-v2 task from Open AI Gym (Brockman et al., 2016). We used the Sawyer Button Press Env environment from Metaworld (Yu et al., 2020), using a maximum episode length of 151. ...four continuous control tasks from the standard Open AI Gym (Brockman et al., 2016) benchmark. (A hedged evaluation sketch using these environments follows the table.) |
| Dataset Splits | No | The paper describes evaluation metrics and environmental conditions for experiments but does not provide specific training/validation/test dataset splits (e.g., percentages or exact counts) or reference standard splits from the cited datasets. |
| Hardware Specification | No | The paper describes the software environments and implementations used (e.g., 'SAC implementation from TF Agents'), but it does not provide any specific hardware details such as GPU models, CPU types, or cloud computing instance specifications used for running the experiments. |
| Software Dependencies | No | The paper mentions several software components and frameworks, such as 'TF Agents (Guadarrama et al., 2018)', 'Open AI Gym (Brockman et al., 2016)', and 'Metaworld (Yu et al., 2020)', but it consistently omits specific version numbers for these dependencies. |
| Experiment Setup | Yes | We used a fixed entropy coefficient of 1e-2 for the Max Ent RL results. For the standard RL results, we used the exact same codebase to avoid introducing any confounding factors, simply setting the entropy coefficient to a very small value 1e-5. ...We used 100 episodes of length 100 for evaluating each method. ...We used an entropy coefficient of 1e1 for Max Ent RL and 1e-100 for standard RL. (A hedged sketch of where this coefficient enters a SAC-style loss follows the table.) |
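The Open Datasets and Experiment Setup rows above quote a fixed evaluation protocol (100 episodes of length 100) on standard benchmark environments such as Pusher-v2. The sketch below is a minimal illustration of such an evaluation loop, assuming the classic `gym` 0.x API (`reset()` returns an observation; `step()` returns `(obs, reward, done, info)`) and a hypothetical `policy(obs) -> action` callable; it is not the authors' released code.

```python
# Minimal evaluation sketch (not the authors' code). Assumes the classic
# gym 0.x API and a hypothetical `policy(obs) -> action` callable standing
# in for a trained agent.
import gym
import numpy as np

def evaluate(policy, env_name="Pusher-v2", num_episodes=100, max_steps=100):
    """Average undiscounted return over a fixed number of truncated episodes."""
    env = gym.make(env_name)
    returns = []
    for _ in range(num_episodes):
        obs = env.reset()
        total_reward = 0.0
        for _ in range(max_steps):
            obs, reward, done, _ = env.step(policy(obs))
            total_reward += reward
            if done:
                break
        returns.append(total_reward)
    return float(np.mean(returns))

# Usage with a random-action stand-in for the policy:
# env = gym.make("Pusher-v2")
# print(evaluate(lambda obs: env.action_space.sample()))
```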
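The Experiment Setup row also quotes fixed entropy coefficients (e.g., 1e-2 for Max Ent RL versus 1e-5 to approximate standard RL within the same codebase). The snippet below is a hedged, library-agnostic sketch of where such a coefficient enters a SAC-style actor objective; the actor and critic are toy placeholders, not the TF Agents SAC implementation the paper used.

```python
# Hedged sketch (not the authors' TF Agents code): where a fixed entropy
# coefficient alpha enters a SAC-style actor objective,
#   E[ alpha * log pi(a|s) - Q(s, a) ].
# alpha = 1e-2 matches the Max Ent RL setting quoted above; alpha = 1e-5
# turns the same objective into (approximately) standard RL.
import torch
import torch.nn as nn

class GaussianActor(nn.Module):
    """Toy stand-in for a policy network; not the paper's architecture."""
    def __init__(self, obs_dim, act_dim):
        super().__init__()
        self.net = nn.Linear(obs_dim, 2 * act_dim)
        self.act_dim = act_dim

    def forward(self, states):
        mean, log_std = self.net(states).split(self.act_dim, dim=-1)
        return torch.distributions.Normal(mean, log_std.clamp(-5, 2).exp())

def actor_loss(actor, critic, states, alpha=1e-2):
    dist = actor(states)
    actions = dist.rsample()                    # reparameterized sample
    log_probs = dist.log_prob(actions).sum(-1)  # joint log-density over action dims
    q_values = critic(states, actions)          # assumed shape: [batch]
    return (alpha * log_probs - q_values).mean()

# Usage with a placeholder Q-function and a batch of 4 random states:
actor = GaussianActor(obs_dim=3, act_dim=2)
critic = lambda s, a: torch.zeros(s.shape[0])
loss = actor_loss(actor, critic, torch.randn(4, 3), alpha=1e-2)
```

Setting `alpha` near zero recovers the standard RL baseline described in the quoted setup without changing any other part of the code path, which matches the paper's stated strategy of varying only the entropy coefficient within one codebase.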