Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
State2Explanation: Concept-Based Explanations to Benefit Agent Learning and User Understanding
Authors: Devleena Das, Sonia Chernova, Been Kim
NeurIPS 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experimental validations, in Connect 4 and Lunar Lander, demonstrate the success of S2E in providing a dual-benefit, successfully informing reward shaping and improving agent learning rate, as well as significantly improving end user task performance at deployment time. |
| Researcher Affiliation | Collaboration | Devleena Das School of Interactive Computing Georgia Institute of Technology EMAIL Sonia Chernova School of Interactive Computing Georgia Institute of Technology EMAIL Been Kim Google Research EMAIL |
| Pseudocode | No | The paper includes architectural diagrams and descriptions of its framework, but it does not contain explicit pseudocode blocks or algorithms labeled as such. |
| Open Source Code | Yes | See link to code in Appendix F. https://anonymous.4open.science/r/S2E/README.md |
| Open Datasets | Yes | Lunar Lander is a trajectory optimization problem in which the lander must land on a landing pad. We utilize LunarLander-v2 from OpenAI Gym [4]. |
| Dataset Splits | Yes | To train and evaluate MC4 and MLL, we utilize a 60%-20%-20% train-valid-test split on DC4 and DLL (see total dataset size in Appendix C.1). |
| Hardware Specification | Yes | To train MC4 and MLL, we utilize a desktop computer with a NVIDIA GTX 1060 6GB GPU and an Intel i7 processor. |
| Software Dependencies | No | The paper mentions software like 'LunarLander-v2 from OpenAI Gym [4]' and 'MuZero [52]', but it does not provide specific version numbers for software dependencies such as libraries or programming languages. |
| Experiment Setup | Yes | The models are trained with learning rate of 0.001, batch size of 128, Adam Optimizer, and trained with 10 epochs. |
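
As a rough illustration of the Dataset Splits and Experiment Setup rows, the sketch below records the reported hyperparameters and draws a 60%-20%-20% train-valid-test split over sample indices. It is not the authors' code: the names `TRAIN_CONFIG` and `split_indices`, the shuffling step, and the seed are assumptions made for illustration, since the table does not say how the split was drawn.

```python
import random

# Hyperparameters as quoted in the Experiment Setup row.
# The dictionary layout and key names are hypothetical.
TRAIN_CONFIG = {
    "learning_rate": 0.001,
    "batch_size": 128,
    "optimizer": "Adam",
    "epochs": 10,
}


def split_indices(n_samples, fractions=(0.6, 0.2, 0.2), seed=0):
    """Return (train, valid, test) index lists for a shuffled split.

    Shuffling and the fixed seed are assumptions; only the 60/20/20
    proportions come from the Dataset Splits row.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)

    n_train = round(fractions[0] * n_samples)
    n_valid = round(fractions[1] * n_samples)

    train = indices[:n_train]
    valid = indices[n_train:n_train + n_valid]
    # The test set takes whatever remains, so no sample is dropped.
    test = indices[n_train + n_valid:]
    return train, valid, test


if __name__ == "__main__":
    train, valid, test = split_indices(1000)
    print(len(train), len(valid), len(test))
```

Fixing the seed keeps the three subsets identical across runs, which matters when a report like this one is used to re-create an evaluation.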