Efficient World Models with Context-Aware Tokenization

Authors: Vincent Micheli, Eloi Alonso, François Fleuret

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In the Crafter benchmark, Δ-IRIS sets a new state of the art at multiple frame budgets, while being an order of magnitude faster to train than previous attention-based approaches.
Researcher Affiliation | Academia | University of Geneva, Switzerland. Correspondence to: <first.last@unige.ch>.
Pseudocode | No | No structured pseudocode or algorithm blocks labeled 'Pseudocode' or 'Algorithm' were found.
Open Source Code | Yes | We release our code and models at https://github.com/vmicheli/delta-iris.
Open Datasets | Yes | In our experiments, we consider the Crafter benchmark (Hafner, 2022) to illustrate Δ-IRIS' ability to scale to a visually rich environment with large frame budgets. Besides, we also include Atari 100k games (Bellemare et al., 2013; Kaiser et al., 2020) in Appendix C to showcase the performance and speed of our agent in the sample-efficient setting.
Dataset Splits | No | The paper mentions evaluating on 'test episodes' and training on 'collected frames' but does not specify a distinct training/validation/test split with percentages or counts for hyperparameter tuning or early stopping.
Hardware Specification | Yes | Our experiments run on a Nvidia A100 40GB GPU, with 5 seeds for all methods and ablations.
Software Dependencies | No | The paper does not provide specific version numbers for software dependencies such as Python, PyTorch, or CUDA.
Experiment Setup | Yes | Appendix A, 'Architectures and hyperparameters', provides detailed hyperparameters for the discrete autoencoder (e.g., frame dimensions (h, w) 64 × 64, layers 5, vocabulary size 1024, tokens per frame 4), the autoregressive transformer (timesteps 21, embedding dimension 512, layers 3, attention heads 8), the actor-critic (H = 15, discount factor γ = 0.997, λ-returns parameter 0.95), and shared training parameters (autoencoder batch size 32, transformer batch size 32, learning rate 1e-4, Adam optimizer).
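
For quick reference, the sketch below collects the Appendix A values quoted in the Experiment Setup row into a single Python configuration object. The DeltaIrisConfig class and its field names are hypothetical conveniences introduced here for illustration; only the numeric values come from the excerpt above, and the released code at https://github.com/vmicheli/delta-iris remains the authoritative specification.

```python
# Hypothetical configuration sketch for the hyperparameters quoted from Appendix A.
# Class and field names are illustrative only; consult the official delta-iris
# repository for the actual configuration files and their exact structure.
from dataclasses import dataclass


@dataclass
class DeltaIrisConfig:
    # Discrete autoencoder (tokenizer)
    frame_height: int = 64
    frame_width: int = 64
    autoencoder_layers: int = 5
    vocabulary_size: int = 1024
    tokens_per_frame: int = 4

    # Autoregressive transformer (world model)
    timesteps: int = 21
    embedding_dim: int = 512
    transformer_layers: int = 3
    attention_heads: int = 8

    # Actor-critic
    imagination_horizon: int = 15    # H
    discount_factor: float = 0.997   # gamma
    lambda_returns: float = 0.95     # lambda-returns parameter

    # Shared training settings
    autoencoder_batch_size: int = 32
    transformer_batch_size: int = 32
    learning_rate: float = 1e-4
    optimizer: str = "Adam"


if __name__ == "__main__":
    # Instantiate with the defaults reported in the paper and print them.
    print(DeltaIrisConfig())
```

Keeping the tokenizer, world-model, and actor-critic settings in one dataclass like this makes it straightforward to diff a reproduction attempt against the repository's actual configuration when verifying the reported setup.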