Intelligent Switching for Reset-Free RL
Authors: Darshan Patil, Janarthanan Rajendran, Glen Berseth, Sarath Chandar
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we empirically analyze the performance of RISC. Specifically, we: (1) Investigate whether reverse curriculums are the best approach for reset-free RL; (2) Compare the performance of RISC to other reset-free methods on the EARL benchmark; (3) Evaluate the necessity of both timeout-nonterminal bootstrapping and early switching for RISC with an ablation study. (A sketch illustrating the timeout-bootstrapping distinction appears below the table.) |
| Researcher Affiliation | Academia | Darshan Patil (Mila, Université de Montréal); Janarthanan Rajendran (Dalhousie University); Glen Berseth (Mila, Université de Montréal; Canada CIFAR AI Chair); Sarath Chandar (Mila, École Polytechnique de Montréal; Canada CIFAR AI Chair) |
| Pseudocode | Yes | Algorithm 1: Reset-Free RL with Intelligently Switching Controller (RISC). Input: trajectory switching probability ζ. s, g = env.reset(); t = 0; check_switch = random() < ζ; while True do: a = agent.act(s, g); s', r = env.step(a); agent.update(s, a, r, s', g); t = t + 1; if should_switch(t, agent.Qf, s', g, check_switch) then g = switch_goals(); t = 0; check_switch = random() < ζ; end if; s = s'; end while. (A runnable Python sketch of this loop appears below the table.) |
| Open Source Code | Yes | Code available at https://github.com/chandar-lab/RISC. |
| Open Datasets | Yes | We evaluate our algorithm’s performance on the recently proposed EARL benchmark (Sharma et al., 2021b). |
| Dataset Splits | No | The paper does not specify explicit training/validation/test splits, but rather describes an evaluation protocol where agents are evaluated after certain timesteps in simulated environments. |
| Hardware Specification | No | All experiments were run as CPU jobs. The paper does not specify any particular CPU models, GPUs, or other detailed hardware specifications used for the experiments. |
| Software Dependencies | No | All of our agents for the experiments on the EARL benchmark (Sharma et al., 2021b) use SAC (Haarnoja et al., 2018) as the base agent. ... For the 4 rooms experiments, all agents use a DQN (Mnih et al., 2015) agent as their base. The paper mentions the use of SAC and DQN as base agents but does not provide specific version numbers for software dependencies like PyTorch, TensorFlow, Python, or CUDA. |
| Experiment Setup | Yes | The other hyperparameters used for the base agent are described in Table 1. The experiments on the 4-rooms gridworld (Chevalier-Boisvert et al., 2018) use DQN (Mnih et al., 2015) as the base agent. The corresponding hyperparameters for those experiments are shown in Table 2. The additional hyperparameters for RISC are shown in Table 3. |
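
The pseudocode row above reduces to a single continuing interaction loop in which the agent periodically switches its current goal (for example, between the forward task and returning to the initial state) instead of relying on external resets. The following is a minimal Python sketch of that loop, assuming hypothetical `env` and `agent` interfaces (`env.reset`, `env.step`, `agent.act`, `agent.update`, `agent.q_value`, `agent.switch_goals`) and a simple Q-value-threshold reading of the `should_switch(t, agent.Qf, s', g, check_switch)` test. It is not the authors' implementation, which is available in the linked repository.

```python
import random


def risc_loop(env, agent, zeta, q_threshold=0.9,
              max_trajectory_steps=500, num_steps=100_000):
    """Sketch of Algorithm 1 (RISC) under assumed interfaces.

    zeta: probability that a trajectory is allowed to switch goals early.
    q_threshold, max_trajectory_steps: hypothetical stand-ins for the
    switching criterion encapsulated by should_switch() in the pseudocode.
    """
    state, goal = env.reset()            # single reset; reset-free afterwards
    t = 0
    check_switch = random.random() < zeta

    for _ in range(num_steps):
        action = agent.act(state, goal)
        next_state, reward = env.step(action)
        agent.update(state, action, reward, next_state, goal)
        t += 1

        # Switch goals if this trajectory was sampled for early switching and
        # the agent is confident it can reach the current goal, or if the
        # trajectory has run too long (timeout).
        confident = agent.q_value(next_state, goal) > q_threshold
        if (check_switch and confident) or t >= max_trajectory_steps:
            goal = agent.switch_goals()  # e.g., toggle forward <-> reset goal
            t = 0
            check_switch = random.random() < zeta

        state = next_state
```

Sampling `check_switch` once per trajectory rather than per step mirrors the pseudocode: whether a trajectory may switch early is decided when it starts.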
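
The ablation mentioned in the Research Type row refers to timeout-nonterminal bootstrapping: when a trajectory ends only because of a step limit, the final state is not a true terminal, so the TD target should still bootstrap from the next state's value. The snippet below is a generic illustration of that distinction, not the paper's code; the function name and the `done` / `timeout` flags are assumptions for illustration.

```python
def td_target(reward, next_q, done, timeout, gamma=0.99):
    """One-step TD target that bootstraps through timeouts.

    done:    the trajectory ended (for any reason).
    timeout: the end was an artificial step-limit cutoff rather than a
             true environment termination.
    """
    # Only a non-timeout termination is a true terminal; a timeout keeps
    # the bootstrap term because the episode could have continued.
    true_terminal = done and not timeout
    bootstrap = 0.0 if true_terminal else 1.0
    return reward + gamma * bootstrap * next_q


# Example: a timeout transition still bootstraps from the next state's value.
# td_target(reward=0.0, next_q=5.0, done=True, timeout=True) == 4.95
```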