State2Explanation: Concept-Based Explanations to Benefit Agent Learning and User Understanding

Authors: Devleena Das, Sonia Chernova, Been Kim

NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experimental validations, in Connect 4 and Lunar Lander, demonstrate the success of S2E in providing a dual benefit, successfully informing reward shaping and improving agent learning rate, as well as significantly improving end user task performance at deployment time.
Researcher Affiliation | Collaboration | Devleena Das, School of Interactive Computing, Georgia Institute of Technology (ddas41@gatech.edu); Sonia Chernova, School of Interactive Computing, Georgia Institute of Technology (chernova@gatech.edu); Been Kim, Google Research (beenkim@google.com)
Pseudocode | No | The paper includes architectural diagrams and descriptions of its framework, but it does not contain explicit pseudocode blocks or algorithms labeled as such.
Open Source Code | Yes | See link to code in Appendix F: https://anonymous.4open.science/r/S2E/README.md
Open Datasets | Yes | Lunar Lander is a trajectory optimization problem in which the lander must land on a landing pad. We utilize LunarLander-v2 from OpenAI Gym [4]. (Environment setup is sketched after this table.)
Dataset Splits | Yes | To train and evaluate MC4 and MLL, we utilize a 60%-20%-20% train-valid-test split on DC4 and DLL (see total dataset size in Appendix C.1). (The split procedure is sketched after this table.)
Hardware Specification | Yes | To train MC4 and MLL, we utilize a desktop computer with an NVIDIA GTX 1060 6GB GPU and an Intel i7 processor.
Software Dependencies | No | The paper mentions software such as 'LunarLander-v2 from OpenAI Gym [4]' and 'MuZero [52]', but it does not provide specific version numbers for software dependencies such as libraries or programming languages.
Experiment Setup | Yes | The models are trained with a learning rate of 0.001, a batch size of 128, the Adam optimizer, and 10 epochs. (The training configuration is sketched after this table.)
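
For concreteness, the environment named in the Open Datasets row is available from the public Gym package. The snippet below is a minimal rollout sketch with a random placeholder policy, assuming the classic OpenAI Gym API (pre-0.26 reset/step signatures) and the Box2D extra; it is not the authors' agent or training code.

```python
# Minimal sketch of the reported environment, assuming classic OpenAI Gym
# (gym < 0.26) with the Box2D extra installed: pip install "gym[box2d]".
import gym

env = gym.make("LunarLander-v2")
obs = env.reset()            # older Gym releases return only the observation

done = False
while not done:
    action = env.action_space.sample()          # placeholder random policy
    obs, reward, done, info = env.step(action)  # 4-tuple in classic Gym

env.close()
```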
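
The 60%-20%-20% split reported in the Dataset Splits row can be reproduced generically as below. X and y are placeholder arrays standing in for the concept datasets (DC4, DLL), and scikit-learn is an assumed convenience here, not the authors' tooling.

```python
# Sketch of a 60%/20%/20% train/valid/test split using scikit-learn.
import numpy as np
from sklearn.model_selection import train_test_split

X = np.random.rand(1000, 8)              # placeholder features
y = np.random.randint(0, 2, size=1000)   # placeholder concept labels

# Carve off 60% for training, then split the remaining 40% evenly.
X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.4, random_state=0)
X_valid, X_test, y_valid, y_test = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)
```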
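
The Experiment Setup row fixes the optimizer, learning rate, batch size, and epoch count but not the network layout, so the sketch below uses a placeholder MLP (ConceptNet) and random tensors in place of MC4/MLL and their datasets; only the hyperparameters are taken from the paper.

```python
# Sketch of the reported training configuration: Adam, lr 0.001, batch size 128, 10 epochs.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

class ConceptNet(nn.Module):
    """Placeholder concept classifier; the paper's exact architecture is not given here."""
    def __init__(self, in_dim=8, n_concepts=2):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, n_concepts))

    def forward(self, x):
        return self.net(x)

# Placeholder tensors standing in for the 60% training split.
X_train = torch.rand(600, 8)
y_train = torch.randint(0, 2, (600,))
loader = DataLoader(TensorDataset(X_train, y_train), batch_size=128, shuffle=True)

model = ConceptNet()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
criterion = nn.CrossEntropyLoss()

for epoch in range(10):
    for xb, yb in loader:
        optimizer.zero_grad()
        loss = criterion(model(xb), yb)
        loss.backward()
        optimizer.step()
```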