Learning from Interventions Using Hierarchical Policies for Safe Learning
Authors: Jing Bi, Vikas Dhiman, Tianyou Xiao, Chenliang Xu
AAAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments show that LfI using sub-goals in a hierarchical policy framework trains faster and achieves better asymptotic performance than typical LfD. |
| Researcher Affiliation | Academia | 1University of Rochester 2University of California San Diego |
| Pseudocode | Yes | Algorithm 1: Learn-from-Intervention by Backtracking |
| Open Source Code | No | The paper mentions including a demo video in the supplementary material but does not explicitly state that the source code for their methodology is publicly available or provide a link. |
| Open Datasets | Yes | We use a 3D urban driving simulator CARLA (Dosovitskiy et al. 2017). |
| Dataset Splits | No | The paper mentions collecting data for training and evaluating the agent for certain durations ('30 mins recorded data', 'test agent for 15 mins') but does not specify explicit training/validation/test dataset splits (e.g., percentages or sample counts). |
| Hardware Specification | Yes | We equipped an off-the-shelf 1/10 scale (13″ × 10″ × 11″) truck with an embedded computer (Nvidia TX2), an Intel RealSense D415 as the primary central camera and two webcams on the sides. |
| Software Dependencies | No | The paper mentions using the CARLA simulator and ResNet-50 but does not provide specific version numbers for these or other software dependencies like programming languages, frameworks, or libraries. |
| Experiment Setup | Yes | Equation 8 is minimized with a learning rate of 1e-5 using the Adam solver. For each experiment, we use behavior cloning with 30 mins of recorded data (~7200 frames) in our first iteration and test the agent for 15 mins in each subsequent iteration. We initialize ResNet-50 with pre-trained parameters and fine-tune only the top three stages. We use ELU nonlinearities after all hidden layers and apply 50% dropout after fully-connected hidden layers. (A minimal code sketch of this setup follows the table.) |
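The Experiment Setup row above pins down most of the optimization and architecture details. Below is a minimal sketch of that configuration, not the authors' released code: it assumes PyTorch/torchvision (the paper names no framework), interprets "the top three stages" of ResNet-50 as `layer2`–`layer4`, and uses placeholder hidden sizes (512, 256) and a two-dimensional control output; the paper's Equation 8 loss is not reproduced here.

```python
# Hedged sketch of the quoted setup: pre-trained ResNet-50 with only the top
# three stages fine-tuned, ELU after hidden layers, 50% dropout after
# fully-connected hidden layers, Adam with lr = 1e-5.
import torch
import torch.nn as nn
from torchvision import models

backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)

# Freeze all parameters, then unfreeze the top three stages.
# (Which blocks count as "the top three stages" is our assumption.)
for p in backbone.parameters():
    p.requires_grad = False
for stage in (backbone.layer2, backbone.layer3, backbone.layer4):
    for p in stage.parameters():
        p.requires_grad = True

# Replace the ImageNet classifier with a small control head.
# Hidden sizes and the 2-dim output (e.g. steering, throttle) are guesses.
backbone.fc = nn.Sequential(
    nn.Linear(backbone.fc.in_features, 512),
    nn.ELU(),
    nn.Dropout(p=0.5),
    nn.Linear(512, 256),
    nn.ELU(),
    nn.Dropout(p=0.5),
    nn.Linear(256, 2),
)

# Optimize only the unfrozen parameters with the quoted learning rate.
optimizer = torch.optim.Adam(
    (p for p in backbone.parameters() if p.requires_grad), lr=1e-5
)
```

Restricting the optimizer to `requires_grad` parameters mirrors the quoted choice to fine-tune only the upper stages while leaving the early convolutional features fixed.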