Unsupervised State Representation Learning in Atari
Authors: Ankesh Anand, Evan Racah, Sherjil Ozair, Yoshua Bengio, Marc-Alexandre Côté, R Devon Hjelm
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We introduce a new benchmark based on Atari 2600 games where we evaluate representations based on how well they capture the ground truth state variables. ... Finally, we compare our technique with other state-of-the-art generative and contrastive representation learning methods. |
| Researcher Affiliation | Collaboration | Ankesh Anand (Mila, Université de Montréal; Microsoft Research); Evan Racah (Mila, Université de Montréal); Sherjil Ozair (Mila, Université de Montréal); Yoshua Bengio (Mila, Université de Montréal); Marc-Alexandre Côté (Microsoft Research); R Devon Hjelm (Microsoft Research; Mila, Université de Montréal) |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code associated with this work is available at https://github.com/mila-iqia/atari-representation-learning |
| Open Datasets | Yes | To systematically evaluate the ability of different representation learning methods at capturing the true underlying factors of variation, we propose a benchmark based on Atari 2600 games using the Arcade Learning Environment [ALE, 28]. ... We make this available with an easy-to-use gym wrapper, which returns this information with no change to existing code using gym interfaces. (See the wrapper usage sketch below the table.) |
| Dataset Splits | Yes | We train each linear probe with 35,000 frames and use 5,000 and 10,000 frames each for validation and test respectively. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions "PyTorch [79] and Weights&Biases" but does not specify version numbers for these software components. |
| Experiment Setup | Yes | For all our experiments, we used ε = 0.2. All methods use the same base encoder architecture, which is the CNN from [73], but adapted for the full 160×210 Atari frame size. To ensure a fair comparison, we use a representation size of 256 for each method. We train each linear probe with 35,000 frames and use 5,000 and 10,000 frames each for validation and test respectively. We use early stopping and a learning rate scheduler based on plateaus in the validation loss. (See the probe-training sketch below the table.) |
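
The gym wrapper quoted in the Open Datasets row can be exercised in a few lines. The sketch below follows the README of https://github.com/mila-iqia/atari-representation-learning: the wrapper name `AtariARIWrapper`, the `info["labels"]` field, and the older four-tuple Gym step API are taken from that documentation and may have changed in later releases.

```python
import gym
from atariari.benchmark.wrapper import AtariARIWrapper

# Wrap a standard Gym Atari environment. The wrapper adds ground-truth
# state variables (read from the emulator RAM) to the `info` dict and
# otherwise leaves the Gym interface unchanged.
env = AtariARIWrapper(gym.make("MsPacmanNoFrameskip-v4"))
obs = env.reset()
obs, reward, done, info = env.step(env.action_space.sample())

# info["labels"] maps state-variable names (e.g. player x/y position)
# to their ground-truth values for the current frame.
print(info["labels"])
```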
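
The probe protocol quoted in the Experiment Setup row (linear probes on frozen 256-d representations; 35,000/5,000/10,000 train/validation/test frames; early stopping and a plateau-based learning-rate schedule) can be sketched as follows. This is a minimal illustration, not the authors' code: the full-batch updates, Adam optimizer, learning rate, and patience values are assumptions made for brevity.

```python
import torch
import torch.nn as nn

def train_linear_probe(train_x, train_y, val_x, val_y, num_classes,
                       max_epochs=200, patience=10):
    """Fit a linear probe on frozen features (e.g. 256-d, per the paper).

    train_x/val_x: float tensors of shape (N, 256); train_y/val_y: long labels.
    """
    probe = nn.Linear(train_x.shape[1], num_classes)
    opt = torch.optim.Adam(probe.parameters(), lr=3e-4)  # lr is an assumption
    # Drop the learning rate when the validation loss plateaus, matching
    # the scheduler described in the quoted setup.
    sched = torch.optim.lr_scheduler.ReduceLROnPlateau(opt, patience=5)
    loss_fn = nn.CrossEntropyLoss()

    best_val, epochs_without_improvement = float("inf"), 0
    for _ in range(max_epochs):
        probe.train()
        opt.zero_grad()
        loss_fn(probe(train_x), train_y).backward()
        opt.step()

        probe.eval()
        with torch.no_grad():
            val_loss = loss_fn(probe(val_x), val_y).item()
        sched.step(val_loss)

        # Early stopping on the validation loss.
        if val_loss < best_val:
            best_val, epochs_without_improvement = val_loss, 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break
    return probe
```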