Learning from Ambiguous Demonstrations with Self-Explanation Guided Reinforcement Learning

Authors: Yantian Zha, Lin Guan, Subbarao Kambhampati

AAAI 2024

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experimental results show that an RLfD model can be improved by using our SERLfD framework in terms of training stability and performance. |
| Researcher Affiliation | Academia | Arizona State University {yantian.zha,guanlin,rao}@asu.edu |
| Pseudocode | Yes | Algorithm 1: The SERLfD Learning Algorithm |
| Open Source Code | Yes | To foster further research in self-explanation-guided robot learning, we have made our demonstrations and code publicly accessible at https://github.com/YantianZha/SERLfD. |
| Open Datasets | Yes | To foster further research in self-explanation-guided robot learning, we have made our demonstrations and code publicly accessible at https://github.com/YantianZha/SERLfD. |
| Dataset Splits | No | The paper does not explicitly mention training/validation/test splits or cross-validation methodology. |
| Hardware Specification | No | The paper describes the simulated robot and environment (Fetch Mobile Manipulator in the PyBullet simulator) but does not provide specific details about the computing hardware (e.g., CPU, GPU, memory) used for experiments. |
| Software Dependencies | No | The paper mentions software components such as the PyBullet simulator, Twin-Delayed DDPG (TD3), and Soft Actor-Critic (SAC), but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | Each training episode had a maximum of 50 steps. (A minimal episode-loop sketch follows this table.) |
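The one concrete training detail surfaced above is the 50-step episode cap. The sketch below illustrates how such a cap shapes an episode loop; it is a minimal illustration, not the authors' code. Only the 50-step limit comes from the paper, while the dummy environment, the policy, and all names are hypothetical stand-ins (the paper itself uses a Fetch Mobile Manipulator in PyBullet with TD3/SAC agents).

```python
"""Minimal sketch of a training episode capped at 50 steps.
Everything except MAX_EPISODE_STEPS is a hypothetical placeholder."""

import random

MAX_EPISODE_STEPS = 50  # from the paper: "Each training episode had a maximum of 50 steps."


class DummyEnv:
    """Placeholder environment with a gym-style reset/step interface."""

    def reset(self):
        self.t = 0
        return 0.0  # dummy initial state

    def step(self, action):
        self.t += 1
        done = self.t >= MAX_EPISODE_STEPS  # environment also terminates at the cap
        return 0.0, random.random(), done, {}  # state, reward, done, info


def run_episode(env, policy):
    """Roll out one episode, stopping at termination or the 50-step cap."""
    state = env.reset()
    total_reward, steps = 0.0, 0
    done = False
    while not done and steps < MAX_EPISODE_STEPS:
        action = policy(state)  # e.g., a TD3 or SAC policy in the actual experiments
        state, reward, done, _ = env.step(action)
        total_reward += reward
        steps += 1
    return total_reward, steps


if __name__ == "__main__":
    reward, steps = run_episode(DummyEnv(), policy=lambda s: 0)
    print(f"episode finished after {steps} steps, return={reward:.2f}")
```

In a real replication, the cap would typically be enforced by the environment wrapper (e.g., a time-limit wrapper around the PyBullet task) rather than by the training loop itself; the loop-side check here just makes the constraint explicit.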