Constrained Cross-Entropy Method for Safe Reinforcement Learning
Authors: Min Wen, Ufuk Topcu
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show with simulation experiments that the proposed algorithm can effectively learn feasible policies without assumptions on the feasibility of initial policies, even with non-Markovian objective functions and constraint functions. |
| Researcher Affiliation | Academia | Min Wen Department of Electrical and Systems Engineering University of Pennsylvania wenm@seas.upenn.edu |
| Pseudocode | Yes | Algorithm 1 Constrained Cross-Entropy Method |
| Open Source Code | No | The paper does not provide an explicit statement or link for the open-source code of the described methodology. |
| Open Datasets | No | The paper describes a 'mobile robot navigation task' in a simulated environment but does not mention using a publicly available dataset, nor does it provide concrete access information (link, DOI, citation) for any data used. |
| Dataset Splits | No | The paper does not provide specific details on training, validation, or test dataset splits. It discusses evaluation based on 'learning curves' but not data partitioning. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running its experiments. It only mentions implementation in 'rllab [5]'. |
| Software Dependencies | No | The paper states 'All experiments are implemented in rllab [5]' but does not provide version numbers for rllab or any other software dependencies. |
| Experiment Setup | Yes | For all experiments, the agent's policy is modeled as a fully connected neural network with two hidden layers with 30 nodes in each layer. Trajectory length for all experiments is set to N = 30. |
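The Experiment Setup row specifies the policy architecture used in the paper's simulations: a fully connected network with two hidden layers of 30 units each. A minimal NumPy sketch of such a network is below; note that the input/output dimensions, tanh activations, and weight initialization are assumptions for illustration, as the paper excerpt does not state them.

```python
import numpy as np

def init_policy(obs_dim, act_dim, hidden=30, seed=None):
    """Initialize a 2-hidden-layer MLP (30 units per layer, per the paper).

    Weight scale and layer sizes other than the hidden width are
    illustrative assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    sizes = [obs_dim, hidden, hidden, act_dim]
    return [(rng.normal(0.0, 0.1, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def policy_forward(params, obs):
    """Forward pass: tanh on hidden layers, linear output (assumed)."""
    h = np.asarray(obs, dtype=float)
    for i, (W, b) in enumerate(params):
        h = h @ W + b
        if i < len(params) - 1:  # activation on hidden layers only
            h = np.tanh(h)
    return h

# Example: a hypothetical 4-dimensional observation, 2-dimensional action.
params = init_policy(obs_dim=4, act_dim=2, seed=0)
action = policy_forward(params, np.zeros(4))
```

In a cross-entropy-method setting such as this paper's, the flattened parameters of a network like this would be the sampled decision variable, with a distribution over parameters updated from elite samples; that outer loop is Algorithm 1 in the paper and is not reproduced here.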