reproducibilityindex.ai

Assumed Density Filtering Q-learning

Authors: Heejin Jeong, Clark Zhang, George J. Pappas, Daniel D. Lee

IJCAI 2019

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our empirical results demonstrate that ADFQ outperforms comparable algorithms on various Atari 2600 games, with drastic improvements in highly stochastic domains or domains with a large action space.
Researcher Affiliation	Academia	(1) University of Pennsylvania, Philadelphia, PA 19104; (2) Cornell Tech, New York, NY 10044
Pseudocode	Yes	Algorithm 1 ADFQ algorithm
Open Source Code	Yes	Example source code is available online [footnote 1: https://github.com/coco66/ADFQ].
Open Datasets	Yes	We tested on six Atari games, Enduro (|A| = 9), Boxing (|A| = 18), Pong (|A| = 6), Asterix (|A| = 9), Kung-Fu Master (|A| = 14), and Breakout (|A| = 4), from the Open AI gym simulator [Brockman et al., 2016].
Dataset Splits	No	Each learning was greedily evaluated at every epoch (= TH/100) for 3 times, and their averaged results are presented in Fig.5. The entire experiment was repeated for 3 random seeds.
Hardware Specification	No	No specific hardware details (e.g., GPU/CPU models, memory amounts, or cloud instance types) used for running experiments were provided in the paper.
Software Dependencies	No	For baselines, we used DQN and Double DQN with prioritized experience replay implemented in Open AI baselines [footnote 2].
Experiment Setup	Yes	We used prioritized experience replay [Schaul et al., 2015] and a combined Huber loss functions of mean and variance. (...) We used ϵ-greedy action policy with ϵ annealed from 1.0 to 0.01 for the baselines as well as ADFQ. (...) Rewards were normalized to {-1, 0, 1} and different from raw scores of the games.
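
The two setup details quoted in the Experiment Setup row (ϵ annealed from 1.0 to 0.01, and rewards normalized to {-1, 0, 1}) can be illustrated with a minimal Python sketch. It is not taken from the ADFQ repository. The function names `linear_epsilon` and `normalize_reward`, the linear shape of the schedule, the `anneal_steps` parameter, and the sign-based normalization are all assumptions made for illustration; the excerpt above states only the start and end values of ϵ and the target reward set.

```python
def linear_epsilon(step, anneal_steps, eps_start=1.0, eps_end=0.01):
    """Hypothetical schedule: interpolate linearly from eps_start to eps_end,
    then hold eps_end. The linear shape and anneal_steps are assumptions."""
    frac = min(max(step / anneal_steps, 0.0), 1.0)
    return eps_start + frac * (eps_end - eps_start)


def normalize_reward(raw_reward):
    """Hypothetical normalization: map a raw game reward to -1, 0, or 1
    according to its sign (assumed; the excerpt gives only the target set)."""
    if raw_reward > 0:
        return 1
    if raw_reward < 0:
        return -1
    return 0


if __name__ == "__main__":
    # Print the exploration rate at a few illustrative training steps.
    for step in (0, 50_000, 100_000, 200_000):
        print(step, round(linear_epsilon(step, anneal_steps=100_000), 4))
    # Raw scores of any magnitude collapse to their sign.
    print([normalize_reward(r) for r in (-5.0, 0.0, 200.0)])
```

Under this normalization a large raw score and a small one contribute the same learning signal, which is consistent with the excerpt's remark that the normalized rewards differ from the raw game scores.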