Efficient World Models with Context-Aware Tokenization
Authors: Vincent Micheli, Eloi Alonso, François Fleuret
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In the Crafter benchmark, Δ-IRIS sets a new state of the art at multiple frame budgets, while being an order of magnitude faster to train than previous attention-based approaches. |
| Researcher Affiliation | Academia | University of Geneva, Switzerland. Correspondence to: <first.last@unige.ch>. |
| Pseudocode | No | No structured pseudocode or algorithm blocks labeled 'Pseudocode' or 'Algorithm' were found. |
| Open Source Code | Yes | We release our code and models at https://github.com/vmicheli/delta-iris. |
| Open Datasets | Yes | In our experiments, we consider the Crafter benchmark (Hafner, 2022) to illustrate Δ-IRIS's ability to scale to a visually rich environment with large frame budgets. Besides, we also include Atari 100k games (Bellemare et al., 2013; Kaiser et al., 2020) in Appendix C to showcase the performance and speed of our agent in the sample-efficient setting. |
| Dataset Splits | No | The paper mentions evaluating on 'test episodes' and training on 'collected frames' but does not specify a distinct training/validation/test split with percentages or counts for hyperparameter tuning or early stopping. |
| Hardware Specification | Yes | Our experiments run on a Nvidia A100 40GB GPU, with 5 seeds for all methods and ablations. |
| Software Dependencies | No | The paper does not provide specific version numbers for software dependencies such as Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | Appendix A, 'Architectures and hyperparameters', provides detailed hyperparameters for the discrete autoencoder (e.g., frame dimensions (h, w) 64 × 64, 5 layers, vocabulary size 1024, 4 tokens per frame), the autoregressive transformer (21 timesteps, embedding dimension 512, 3 layers, 8 attention heads), the Actor-Critic (imagination horizon H = 15, discount factor γ = 0.997, λ-returns parameter 0.95), and shared training parameters (autoencoder batch size 32, transformer batch size 32, learning rate 1e-4, Adam optimizer). A configuration sketch summarizing these values follows the table. |
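
For reference, the hyperparameters quoted in the Experiment Setup row can be collected into a single configuration object. This is a minimal sketch only: the class and field names below are illustrative and do not correspond to the actual configuration schema of the delta-iris repository; the values are taken from Appendix A as reported above.

```python
# Hypothetical configuration sketch for the Δ-IRIS experiment setup.
# Field names are invented for illustration; values are those quoted from Appendix A.
from dataclasses import dataclass


@dataclass
class AutoencoderConfig:
    frame_height: int = 64        # frame dimensions (h, w) = 64 x 64
    frame_width: int = 64
    num_layers: int = 5
    vocabulary_size: int = 1024
    tokens_per_frame: int = 4


@dataclass
class TransformerConfig:
    timesteps: int = 21
    embedding_dim: int = 512
    num_layers: int = 3
    num_attention_heads: int = 8


@dataclass
class ActorCriticConfig:
    imagination_horizon: int = 15   # H
    discount: float = 0.997         # gamma
    lambda_returns: float = 0.95    # lambda-returns parameter


@dataclass
class TrainingConfig:
    autoencoder_batch_size: int = 32
    transformer_batch_size: int = 32
    learning_rate: float = 1e-4
    optimizer: str = "Adam"
```

Grouping the settings by component (tokenizer, world-model transformer, actor-critic, shared training) mirrors how Appendix A reports them, which makes it easy to cross-check a reimplementation against the paper.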