Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Neurosymbolic Reinforcement Learning with Formally Verified Exploration

Authors: Greg Anderson, Abhinav Verma, Isil Dillig, Swarat Chaudhuri

NeurIPS 2020

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Our empirical evaluation, on a suite of continuous control problems, shows that REVEL enforces safe exploration in many scenarios where established RL algorithms (including CPO [1], which is motivated by safe RL) do not, while discovering policies that outperform policies based on static shields." |
| Researcher Affiliation | Academia | Greg Anderson, UT Austin, EMAIL; Abhinav Verma, UT Austin, EMAIL; Isil Dillig, UT Austin, EMAIL; Swarat Chaudhuri, UT Austin, EMAIL |
| Pseudocode | Yes | Algorithm 1, "Reinforcement Learning with Formally Verified Exploration (REVEL)", and Algorithm 2, "Implementation of PROJECTG" |
| Open Source Code | Yes | "The current implementation is available at https://github.com/gavlegoat/safe-learning." |
| Open Datasets | No | "Our experiments used 10 benchmarks that include classic control problems, robotics applications, and benchmarks from prior work [11]. For each of these environments, we hand-constructed a worst-case, piecewise linear model of the dynamics." The paper refers to benchmark environments and hand-constructed models but provides no concrete access information (link, DOI, or formal citation with authors/year) for any publicly available dataset used for training. |
| Dataset Splits | No | The paper discusses training performance and benchmarks but gives no specifics on dataset splits (e.g., percentages or counts for training, validation, or test sets). |
| Hardware Specification | No | The paper does not specify hardware (e.g., GPU/CPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper does not list software dependencies with version numbers (e.g., Python, library, or solver versions) needed to replicate the experiments. |
| Experiment Setup | Yes | "Details of hyperparameters that we used appear in the Appendix." |
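The shielded exploration that the table's evidence refers to can be illustrated with a minimal sketch: a learned (neural) action is kept only if a worst-case linear model of the dynamics predicts the next state stays in a safe region, and otherwise the agent falls back to a verified safe action. All names, the 1-D dynamics, and the fallback policy below are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def shielded_action(state, neural_action, safe_action, A, B, is_safe):
    """Return neural_action if the worst-case linear step A@s + B@a lands in
    the safe region; otherwise fall back to the verified safe action.
    (Hypothetical sketch of shield-style safe exploration, not REVEL's code.)"""
    next_state = A @ state + B @ neural_action  # worst-case linear dynamics model
    return neural_action if is_safe(next_state) else safe_action

# Toy 1-D example: the safe region is |x| <= 1, dynamics x' = x + a.
A = np.array([[1.0]])
B = np.array([[1.0]])
is_safe = lambda s: abs(s[0]) <= 1.0

state = np.array([0.5])
print(shielded_action(state, np.array([0.3]), np.array([0.0]), A, B, is_safe))  # [0.3]: predicted 0.8 is safe
print(shielded_action(state, np.array([0.9]), np.array([0.0]), A, B, is_safe))  # [0.0]: predicted 1.4 is unsafe, shield intervenes
```

A static shield always intervenes with the same fallback; the paper's contribution is learning and formally verifying the shield components jointly, which this sketch does not attempt.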