Transfer Learning for Related Reinforcement Learning Tasks via Image-to-Image Translation
Authors: Shani Gamrian, Yoav Goldberg
ICML 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate this approach on Breakout and Road Fighter in Section 5, and present the results comparing to different baselines. |
| Researcher Affiliation | Collaboration | 1Computer Science Department, Bar-Ilan University, Ramat Gan, Israel 2Allen Institute for Artificial Intelligence. |
| Pseudocode | Yes | Algorithm 1 Imitation Learning |
| Open Source Code | Yes | The code is available at https://github.com/ShaniGam/RL-GAN. |
| Open Datasets | Yes | In this work, we first focus on the Atari game Breakout, in which the main concept is moving the paddle towards the ball in order to maximize the score of the game. We explore the Nintendo game Road Fighter, a car racing game where the goal is to finish the track before the time runs out without crashing. |
| Dataset Splits | No | The paper describes collecting images and training for a number of iterations but does not specify explicit training, validation, and test dataset splits with percentages or counts. |
| Hardware Specification | No | The paper does not specify any particular GPU, CPU, or other hardware models used for running the experiments. |
| Software Dependencies | No | The paper mentions algorithms (A3C, A2C) and frameworks (UNIT, CycleGAN) but does not provide specific version numbers for software dependencies like Python, PyTorch, or other libraries. |
| Experiment Setup | Yes | We train each one of the tasks (before and after the transformation) for 60 million frames, and our evaluation metric is the total reward the agents collect in an episode averaged by the number of episodes... For our experiments we use the same architecture and hyperparameters proposed in the UNIT paper. We initialize the weights with Xavier initialization (Glorot & Bengio, 2010), set the batch size to 1 and train the network for a different number of iterations on each task. |
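The experiment-setup row cites Xavier initialization (Glorot & Bengio, 2010) for the GAN weights. As a minimal sketch of what that initializer computes (not the authors' actual code, which follows the UNIT implementation), the layer sizes below are hypothetical:

```python
import numpy as np

def xavier_uniform(fan_in, fan_out, rng=None):
    # Glorot & Bengio (2010): sample uniformly from
    # [-limit, limit] with limit = sqrt(6 / (fan_in + fan_out)),
    # keeping activation variance roughly constant across layers.
    rng = np.random.default_rng() if rng is None else rng
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_out, fan_in))

# Hypothetical layer shape for illustration only.
W = xavier_uniform(fan_in=256, fan_out=128)
```

Every sampled weight lies within `±sqrt(6 / (256 + 128)) ≈ ±0.125`; frameworks such as PyTorch expose the same rule as `torch.nn.init.xavier_uniform_`.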