Neurosymbolic Reinforcement Learning with Formally Verified Exploration

Authors: Greg Anderson, Abhinav Verma, Isil Dillig, Swarat Chaudhuri

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our empirical evaluation, on a suite of continuous control problems, shows that REVEL enforces safe exploration in many scenarios where established RL algorithms (including CPO [1], which is motivated by safe RL) do not, while discovering policies that outperform policies based on static shields."
Researcher Affiliation | Academia | Greg Anderson (UT Austin, ganderso@cs.utexas.edu); Abhinav Verma (UT Austin, verma@utexas.edu); Isil Dillig (UT Austin, isil@cs.utexas.edu); Swarat Chaudhuri (UT Austin, swarat@cs.utexas.edu)
Pseudocode | Yes | Algorithm 1 (Reinforcement Learning with Formally Verified Exploration, REVEL) and Algorithm 2 (Implementation of PROJECTG). A hedged sketch of the loop structure that Algorithm 1 describes appears below the table.
Open Source Code | Yes | "The current implementation is available at https://github.com/gavlegoat/safe-learning."
Open Datasets | No | "Our experiments used 10 benchmarks that include classic control problems, robotics applications, and benchmarks from prior work [11]. For each of these environments, we hand-constructed a worst-case, piecewise linear model of the dynamics." The paper refers to benchmark environments and hand-constructed models, but does not provide concrete access information (link, DOI, or formal citation with authors and year) for any publicly available dataset used in training. A sketch of one way such a worst-case model can be represented appears below the table.
Dataset Splits | No | The paper discusses training performance on its benchmarks but does not give dataset splits (e.g., percentages or counts for training, validation, or test sets).
Hardware Specification | No | The paper does not specify the hardware (e.g., GPU/CPU models, memory) used to run the experiments.
Software Dependencies | No | The paper does not list software dependencies with version numbers (e.g., Python, library, or solver versions) needed to replicate the experiments.
Experiment Setup | Yes | "Details of hyperparameters that we used appear in the Appendix."
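
The paper's pseudocode itself is not reproduced in this report. For orientation only, here is a minimal sketch of the shielded-exploration structure that Algorithm 1 (REVEL) describes: act with the neural policy, fall back to a verified symbolic shield whenever the proposed action could be unsafe under the worst-case model, update the neural component, and periodically project back into the verifiably safe policy class (what the paper calls PROJECTG, Algorithm 2). Every interface below (`env`, `neural_policy`, `shield`, `worst_case_model`, `project_g`) is a hypothetical placeholder, not the authors' API; the actual implementation lives in the linked repository.

```python
def revel_loop(env, neural_policy, shield, worst_case_model, project_g,
               n_updates=500, horizon=1000):
    """Sketch of the alternation in Algorithm 1: shielded rollouts,
    neural-policy updates, and verified projection of the shield."""
    for _ in range(n_updates):
        state = env.reset()
        trajectory = []
        for _ in range(horizon):
            action = neural_policy.act(state)
            # Shielding: if the proposed action could leave the verified safe
            # region under the worst-case dynamics, use the shield's action.
            if not worst_case_model.action_is_safe(state, action):
                action = shield.act(state)
            next_state, reward, done = env.step(action)
            trajectory.append((state, action, reward))
            state = next_state
            if done:
                break
        # Gradient-style update of the neural component on the safe rollout.
        neural_policy.update(trajectory)
        # Projection step (Algorithm 2 in the paper): recover a shield such
        # that the combined neurosymbolic policy is verifiably safe.
        shield = project_g(neural_policy, shield, worst_case_model)
    return neural_policy, shield
```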
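
For context on the "Open Datasets" row: the environments are not datasets in the usual sense, and the hand-constructed, worst-case, piecewise linear dynamics models are described only in the paper. The sketch below shows one plausible representation of such a model, assuming each piece is an affine update valid on a polyhedral region, with a per-coordinate disturbance bound standing in for worst-case model error. The class and field names are illustrative and do not come from the paper or its repository.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class LinearPiece:
    P: np.ndarray   # polyhedral region: this piece applies where P @ x <= q
    q: np.ndarray
    A: np.ndarray   # affine dynamics on the region: x' = A @ x + B @ u + c
    B: np.ndarray
    c: np.ndarray
    eps: float      # worst-case per-coordinate disturbance bound


class PiecewiseLinearModel:
    """A worst-case environment model given as a list of linear pieces."""

    def __init__(self, pieces):
        self.pieces = pieces

    def step_bounds(self, x, u):
        """Componentwise bounds on the next state under any disturbance
        within the active piece's eps bound (worst-case one-step prediction)."""
        for p in self.pieces:
            if np.all(p.P @ x <= p.q):
                nominal = p.A @ x + p.B @ u + p.c
                return nominal - p.eps, nominal + p.eps
        raise ValueError("state lies outside the modeled region")
```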