Vector Quantized Models for Planning

Authors: Sherjil Ozair, Yazhe Li, Ali Razavi, Ioannis Antonoglou, Aäron van den Oord, Oriol Vinyals

ICML 2021

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We conducted two sets of experiments: in Section 5.1, we use chess as a test-bed to show that we can drop some of the assumptions and domain knowledge made in prior work, whilst still achieving state-of-the-art performance; in Section 5.2, with a rich 3D environment (DeepMind Lab), we probe the ability of the model to handle large visual observations in a partially observed environment and to produce high-quality rollouts without degradation. |
| Researcher Affiliation | Collaboration | 1. DeepMind, London, United Kingdom; 2. Mila, University of Montreal. |
| Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide concrete access to source code for the described methodology. |
| Open Datasets | Yes | We use a combination of the Million Base dataset (2.5 million games) and the FICS Elo >2000 dataset (960k games). The validation set consists of 45k games from FICS Elo >2000 from 2017. |
| Dataset Splits | Yes | The validation set consists of 45k games from FICS Elo >2000 from 2017. |
| Hardware Specification | No | The paper does not specify the hardware used to run its experiments. |
| Software Dependencies | No | The paper does not list ancillary software dependencies with version numbers. |
| Experiment Setup | Yes | The quantization layer has 2 codebooks, each of which has 128 codes of 64 dimensions. The final discrete latent is formed by concatenating the 2 codes, forming a 2-hot encoding vector. (...) The codebook has 512 codes of 64 dimensions. Then, as discussed in Section 4, in the second stage we train the state VQVAE on top of the frame-level VQ representations. This model captures the temporal dynamics of trajectories in a second latent layer, consisting of a stack of 32 separate latent variables at each timestep, each with its own codebook comprising 512 codes of 64 dimensions. |
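The quantization scheme quoted in the Experiment Setup row can be illustrated with a minimal sketch: each input sub-vector is matched to its nearest code (by L2 distance) in its own codebook, and the two one-hot index encodings are concatenated into a 2-hot vector. This is an assumption-laden NumPy illustration of the described configuration (2 codebooks, 128 codes of 64 dimensions each), not the authors' implementation; all function names are hypothetical.

```python
import numpy as np

def quantize(x, codebook):
    """Return the index of the nearest code (L2 distance) for each row of x.

    x: (batch, dim) array of continuous vectors.
    codebook: (num_codes, dim) array of learned codes.
    """
    dists = ((x[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    return dists.argmin(axis=1)

def two_hot_latent(x1, x2, codebooks):
    """Quantize two sub-vectors against separate codebooks and concatenate
    the resulting one-hot encodings into a single 2-hot latent vector."""
    idx1 = quantize(x1, codebooks[0])
    idx2 = quantize(x2, codebooks[1])
    n1 = codebooks[0].shape[0]
    n2 = codebooks[1].shape[0]
    batch = x1.shape[0]
    latent = np.zeros((batch, n1 + n2))
    latent[np.arange(batch), idx1] = 1.0            # one-hot for codebook 1
    latent[np.arange(batch), n1 + idx2] = 1.0       # one-hot for codebook 2
    return latent

# Illustrative sizes from the paper's setup: 2 codebooks, 128 codes, 64 dims.
rng = np.random.default_rng(0)
codebooks = [rng.normal(size=(128, 64)) for _ in range(2)]
x1 = rng.normal(size=(4, 64))
x2 = rng.normal(size=(4, 64))
latent = two_hot_latent(x1, x2, codebooks)
print(latent.shape)  # (4, 256): 128 + 128 concatenated
```

Each row of `latent` has exactly two nonzero entries (one per codebook), which is what "2-hot encoding vector" refers to. In a trained VQVAE the codebooks would be learned jointly with the encoder rather than sampled randomly as here.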