Proto-Value Networks: Scaling Representation Learning with Auxiliary Tasks
Authors: Jesse Farebrother, Joshua Greaves, Rishabh Agarwal, Charline Le Lan, Ross Goroshin, Pablo Samuel Castro, Marc G. Bellemare
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through a series of experiments on the Arcade Learning Environment, we demonstrate that proto-value networks produce rich features that may be used to obtain performance comparable to established algorithms, using only linear approximation and a small number (~4M) of interactions with the environment's reward function. |
| Researcher Affiliation | Collaboration | 1 McGill University, 2 Université de Montréal, 3 Mila – Québec AI Institute, 4 University of Oxford, 5 Google Research, Brain Team |
| Pseudocode | Yes | Algorithm 1 gives pseudo-code for the method as implemented with a fixed replay memory. |
| Open Source Code | Yes | We have released a reference implementation along with notebooks demonstrating how to download and use our pre-trained representations at: https://github.com/google-research/google-research/tree/master/pvn. |
| Open Datasets | Yes | During the representation pre-training phase, we use transition data from offline Atari datasets in RL Unplugged (Agarwal et al., 2020; Gulcehre et al., 2020). |
| Dataset Splits | No | The paper mentions 'pre-training phase,' 'online RL phase,' and 'evaluation' but does not explicitly provide details about specific training/validation/test dataset splits, such as percentages or sample counts for data partitioning. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments. |
| Software Dependencies | No | The paper lists several software packages and libraries (e.g., Jax, Flax, Optax, Numpy, Pandas, Matplotlib, Seaborn) with citations, but it does not provide specific version numbers for these dependencies, which are needed to reproduce the software environment. |
| Experiment Setup | Yes | In the tables below we report all relevant hyperparameter choices for both our offline pre-training phase, and online learning phase. Table 1: PVN Hyperparameters. Table 2: Online Hyperparameters. |