
Verifying Reinforcement Learning up to Infinity

Authors: Edoardo Bacci, Mirco Giacobbe, David Parker

IJCAI 2021

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We demonstrate its efficacy on a range of benchmark control problems. ... We evaluate our method over multiple agents for 3 benchmark control problems: a bouncing ball, automated cruise control, and cart-pole. ... Results are shown in Tab. 1.
Researcher Affiliation	Academia	Edoardo Bacci (1), Mirco Giacobbe (2), David Parker (1); (1) University of Birmingham, (2) University of Oxford
Pseudocode	No	The paper does not contain structured pseudocode or algorithm blocks. The methodology is described using text and mathematical equations.
Open Source Code	Yes	https://github.com/phate09/Safe RL Infinity
Open Datasets	Yes	We evaluate our method over multiple agents for 3 benchmark control problems: a bouncing ball, automated cruise control, and cart-pole. We used standard feed forward architectures... [Jaeger et al., 2019; Tran et al., 2020; Brockman et al., 2016].
Dataset Splits	No	The paper describes training RL agents in simulated environments but does not provide specific training/validation/test dataset splits as it's not a traditional supervised learning setup with fixed datasets.
Hardware Specification	Yes	We ran our experiments on a 4-core 4.2GHz with 64GB RAM.
Software Dependencies	No	The paper mentions using 'proximal policy optimisation (PPO)' and 'OpenAI Gym' but does not provide specific version numbers for these or other software dependencies.
Experiment Setup	Yes	We used standard feed forward architectures with 2 hidden layers of size 64 (32 for the bouncing ball), and ReLU activation functions; we used a learning rate of 5e-4. ... We terminate training either when our agent reaches a mean reward of 900 or after 5M training steps.
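
The reported experiment setup can be captured as a small configuration object with a stopping rule. This is only a sketch of the quoted hyperparameters, not the authors' code: the names `TrainingConfig`, `should_stop`, `DEFAULT`, and `BOUNCING_BALL` are hypothetical, and treating "reaches a mean reward of 900" as "greater than or equal to 900" is an assumption.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TrainingConfig:
    """Hyperparameters quoted in the Experiment Setup row (field names are hypothetical)."""
    hidden_sizes: Tuple[int, ...] = (64, 64)  # 2 hidden layers of size 64
    activation: str = "relu"                  # ReLU activation functions
    learning_rate: float = 5e-4               # learning rate of 5e-4
    reward_threshold: float = 900.0           # stop once the mean reward reaches 900
    max_training_steps: int = 5_000_000       # or after 5M training steps


def should_stop(mean_reward: float, steps_done: int, cfg: TrainingConfig) -> bool:
    """Return True when either termination condition holds.

    Assumption: "reaches" is read as >= for both the reward and the step budget.
    """
    return (mean_reward >= cfg.reward_threshold
            or steps_done >= cfg.max_training_steps)


# Default setup, plus the smaller network reported for the bouncing ball.
DEFAULT = TrainingConfig()
BOUNCING_BALL = TrainingConfig(hidden_sizes=(32, 32))
```

A training loop would call `should_stop(mean_reward, steps_done, DEFAULT)` after each evaluation and exit as soon as it returns `True`.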