Explicable Policy Search
Authors: Ze Gong, Yu ("Tony") Zhang
NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate EPS in a set of navigation domains with synthetic human models and in an autonomous driving domain with a user study. The results suggest that our method can generate explicable behaviors that reconcile task performance with human expectations intelligently and has real-world relevance in human-agent teaming domains. |
| Researcher Affiliation | Academia | Ze Gong, Arizona State University, Tempe, AZ 85281, zgong11@asu.edu; Yu Zhang, Arizona State University, Tempe, AZ 85281, yzhan442@asu.edu |
| Pseudocode | No | The main text does not include pseudocode; the algorithm for EPS is deferred to the appendix. |
| Open Source Code | Yes | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] See supplemental material. |
| Open Datasets | No | The paper mentions synthetic human models and a simulated autonomous driving domain but does not provide concrete access information like URLs, DOIs, or specific citations for publicly available datasets. |
| Dataset Splits | No | The paper does not explicitly state training/validation/test dataset splits with percentages or sample counts in the main text. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper mentions using Soft Actor-Critic (SAC) [21] but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | No | The paper mentions tuning a reconciliation parameter λ for EPS but does not provide a comprehensive list of hyperparameters or detailed system-level training settings in the main text. It refers to the appendix for further details on λ tuning, but not for the general experimental setup. |
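The exact reconciliation objective is not reproduced in this summary; the sketch below is a minimal, hypothetical Python illustration of how a reconciliation parameter λ might blend a task reward with an explicability score before a downstream RL algorithm such as SAC is trained on the shaped reward. The names `reconciled_reward`, `explicability_score`, and `lam` are illustrative assumptions, not identifiers from the paper or its supplemental code.

```python
# Hypothetical sketch, not the authors' implementation.
# Assumption: EPS-style reconciliation linearly blends the task reward with an
# explicability score derived from a (synthetic or learned) human model, with
# lam playing the role of the reconciliation parameter λ mentioned above.

def reconciled_reward(task_reward, explicability_score, lam):
    """Blend the environment's task reward with an explicability score.

    task_reward: reward returned by the environment for the agent's action.
    explicability_score: how well the action matches the human model's
        expectation (higher = more explicable).
    lam: reconciliation parameter; lam = 0 recovers pure task optimization.
    """
    return task_reward + lam * explicability_score


def episode_return(transitions, lam):
    """Sum reconciled rewards over one episode of (task_r, expl_s) pairs."""
    return sum(reconciled_reward(r, s, lam) for r, s in transitions)


if __name__ == "__main__":
    # Toy episode: three steps with (task reward, explicability score).
    episode = [(1.0, 0.2), (0.5, 0.8), (1.0, -0.1)]
    for lam in (0.0, 0.5, 1.0):
        print(f"lambda={lam}: return={episode_return(episode, lam):.2f}")
```

Varying `lam` in the toy loop shows the trade-off the paper describes: small values favor task performance, larger values favor behavior that matches human expectations.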