Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

SiT: Symmetry-invariant Transformers for Generalisation in Reinforcement Learning

Authors: Matthias Weissenbacher, Rishabh Agarwal, Yoshinobu Kawahara

ICML 2024 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We showcase SiT's superior generalization over ViTs on MiniGrid and Procgen RL benchmarks, and its sample efficiency on Atari 100k and CIFAR10." and "5. Empirical evaluation"
Researcher Affiliation | Collaboration | 1) RIKEN Center for Advanced Intelligence Project, Tokyo, Japan; 2) Google DeepMind; 3) Graduate School of Information Science and Technology, Osaka University, Japan.
Pseudocode | Yes | "Listing 1. Pseudocode for GSA (PyTorch-like). Changes relative to self-attention in brown." (a baseline self-attention sketch follows the table)
Open Source Code | Yes | "We open sourced the SiT model-code on GitHub."
Open Datasets | Yes | "We showcase SiT's superior generalization over ViTs on MiniGrid and Procgen RL benchmarks, and its sample efficiency on Atari 100k and CIFAR10." and "CIFAR10 dataset (Krizhevsky & Hinton, 2009)"
Dataset Splits | No | The paper describes training and testing procedures but does not explicitly mention a validation dataset split (e.g., percentages or counts for training/validation/test sets).
Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models, or cloud computing specifications used for running experiments.
Software Dependencies | No | The paper mentions software frameworks like IMPALA and torchbeast but does not provide specific version numbers for these or any other key software dependencies (e.g., PyTorch, CUDA, Python).
Experiment Setup | Yes | "We use the stated number of local and global GSA with an embedding dimension of 64 and 8 heads." and "Compared to the ResNet baseline (Raileanu et al., 2020), we employ larger batch-size 96 (instead 8) and PPO-epoch of 2 (instead 3)." (key values are collected in the config sketch after the table)
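The Pseudocode row refers to the paper's Listing 1, which presents GSA as a set of changes relative to standard self-attention. For context, below is a minimal sketch of that vanilla multi-head self-attention baseline in PyTorch-like code; it is not the GSA listing itself, and the module and argument names are illustrative rather than taken from the released SiT code.

```python
# Minimal baseline sketch: standard multi-head self-attention in PyTorch.
# This is NOT the paper's GSA; it is the vanilla self-attention that
# Listing 1 in the paper modifies (GSA-specific changes are marked in
# brown there). Names and defaults here are illustrative assumptions.
import torch
import torch.nn as nn


class SelfAttention(nn.Module):
    def __init__(self, embed_dim: int = 64, num_heads: int = 8):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = embed_dim // num_heads
        self.qkv = nn.Linear(embed_dim, 3 * embed_dim)  # joint Q, K, V projection
        self.proj = nn.Linear(embed_dim, embed_dim)     # output projection

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, tokens, embed_dim)
        b, n, d = x.shape
        qkv = self.qkv(x).reshape(b, n, 3, self.num_heads, self.head_dim)
        q, k, v = qkv.permute(2, 0, 3, 1, 4)            # each: (b, heads, n, head_dim)
        attn = (q @ k.transpose(-2, -1)) * self.head_dim ** -0.5
        attn = attn.softmax(dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(b, n, d)
        return self.proj(out)
```

For the actual GSA modifications, consult Listing 1 in the paper and the open-sourced SiT code on GitHub.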
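The Experiment Setup row quotes the hyperparameters that differ from the ResNet baseline. As a minimal sketch, the four stated values can be collected in a single config; everything about the format (the dict itself, the key names) is a hypothetical placeholder, not the authors' configuration.

```python
# Hypothetical config sketch; only the four values are taken from the quoted
# text (embedding dim 64, 8 heads, batch size 96, 2 PPO epochs).
sit_procgen_config = {
    "gsa_embed_dim": 64,  # embedding dimension of the local/global GSA blocks
    "gsa_num_heads": 8,   # attention heads per GSA block
    "batch_size": 96,     # ResNet PPO baseline uses 8 (Raileanu et al., 2020)
    "ppo_epochs": 2,      # baseline uses 3
}
```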