Transfer Learning for Related Reinforcement Learning Tasks via Image-to-Image Translation

Authors: Shani Gamrian, Yoav Goldberg

Venue: ICML 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate this approach on Breakout and Road Fighter in Section 5, and present the results comparing to different baselines.
Researcher Affiliation | Collaboration | (1) Computer Science Department, Bar-Ilan University, Ramat Gan, Israel; (2) Allen Institute for Artificial Intelligence.
Pseudocode | Yes | Algorithm 1: Imitation Learning
Open Source Code | Yes | The code is available at https://github.com/ShaniGam/RL-GAN.
Open Datasets | Yes | In this work, we first focus on the Atari game Breakout, in which the main concept is moving the paddle towards the ball in order to maximize the score of the game. We explore the Nintendo game Road Fighter, a car racing game where the goal is to finish the track before the time runs out without crashing. (See the environment sketch below.)
Dataset Splits | No | The paper describes collecting images and training for a number of iterations, but does not specify explicit training, validation, and test splits with percentages or counts.
Hardware Specification | No | The paper does not specify any particular GPU, CPU, or other hardware used to run the experiments.
Software Dependencies | No | The paper mentions algorithms (A3C, A2C) and frameworks (UNIT, CycleGAN) but does not provide version numbers for software dependencies such as Python, PyTorch, or other libraries.
Experiment Setup | Yes | We train each one of the tasks (before and after the transformation) for 60 million frames, and our evaluation metric is the total reward the agents collect in an episode averaged by the number of episodes... For our experiments we use the same architecture and hyperparameters proposed in the UNIT paper. We initialize the weights with Xavier initialization (Glorot & Bengio, 2010), set the batch size to 1 and train the network for a different number of iterations on each task. (See the training-setup sketch below.)
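The "datasets" here are emulated games rather than static corpora. As a rough illustration only (not taken from the paper's code), the sketch below loads Breakout through the classic OpenAI Gym Atari interface and runs a short random-action rollout; the environment ID and the pre-0.26 Gym API are assumptions. Road Fighter runs on an NES emulator and is not part of Gym's standard Atari set.

```python
# Minimal sketch, assuming the classic OpenAI Gym Atari interface (pre-0.26 API,
# where step() returns a 4-tuple). The paper does not name its exact wrapper.
import gym

env = gym.make("BreakoutNoFrameskip-v4")
obs = env.reset()
total_reward = 0.0
for _ in range(100):
    action = env.action_space.sample()          # random action, for illustration only
    obs, reward, done, info = env.step(action)
    total_reward += reward
    if done:
        obs = env.reset()
print("random-policy reward over 100 steps:", total_reward)
```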
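The quoted setup names Xavier initialization, a batch size of 1, and the UNIT architecture and hyperparameters. The sketch below is a minimal PyTorch illustration of that configuration, not the authors' implementation: the small convolutional encoder is a stand-in for the UNIT generator, and the learning rate, iteration count, and loss are placeholders.

```python
# Minimal sketch, assuming PyTorch; the network, loss, and optimizer settings are
# placeholders and do not reproduce the UNIT architecture or objective.
import torch
import torch.nn as nn

def xavier_init(module):
    """Apply Xavier (Glorot & Bengio, 2010) initialization to conv/linear weights."""
    if isinstance(module, (nn.Conv2d, nn.Linear)):
        nn.init.xavier_uniform_(module.weight)
        if module.bias is not None:
            nn.init.zeros_(module.bias)

# Stand-in encoder; the paper uses the UNIT architecture and hyperparameters.
encoder = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=7, stride=1, padding=3),
    nn.ReLU(inplace=True),
    nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1),
    nn.ReLU(inplace=True),
)
encoder.apply(xavier_init)

# Batch size of 1, as stated in the quoted setup; the iteration count differs per
# task in the paper, so 10 is a placeholder, and the learning rate is an assumption.
optimizer = torch.optim.Adam(encoder.parameters(), lr=1e-4)
for step in range(10):
    frame = torch.rand(1, 3, 256, 256)          # dummy game frame, batch size 1
    features = encoder(frame)
    loss = features.pow(2).mean()               # placeholder loss, not the UNIT objective
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```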