Learning Symmetric Embeddings for Equivariant World Models
Authors: Jung Yeon Park, Ondrej Biza, Linfeng Zhao, Jan-Willem van de Meent, Robin Walters
ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate this approach in the context of equivariant transition models with 3 distinct forms of symmetry. Our experiments demonstrate that SENs facilitate the application of equivariant networks to data with complex symmetry representations. Moreover, doing so can yield improvements in accuracy and generalization relative to both fully-equivariant and non-equivariant baselines. |
| Researcher Affiliation | Academia | 1Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA 2Informatics Institute, University of Amsterdam, Amsterdam, Netherlands. |
| Pseudocode | No | The paper does not contain a formally labeled 'Pseudocode' or 'Algorithm' block. |
| Open Source Code | Yes | The code for our implementation is available1. 1https://github.com/jypark0/sen |
| Open Datasets | Yes | For example, consider the pairs of images in Figure 1. On the left, we have MNIST digits where a 2D rotation in pixel space induces a corresponding rotation in feature space. Here an E(2)-equivariant network achieves state of the art accuracy (Weiler & Cesa, 2019). |
| Dataset Splits | No | The paper mentions 'training and evaluation datasets' but does not explicitly detail a separate 'validation' split with percentages or counts, or a specific splitting methodology beyond using different seeds for evaluation data. |
| Hardware Specification | Yes | Most experiments were run on a single Nvidia RTX 2080Ti except for 3D Blocks which used a single Nvidia P100 12GB. |
| Software Dependencies | No | The paper does not provide specific software dependency versions (e.g., Python 3.x, PyTorch 1.x, CUDA 11.x), only mentions general techniques or library names without version numbers. |
| Experiment Setup | Yes | For training, we use 1000 episodes of length 100 as training data... For the object-oriented environments, we follow the hyperparameters used in (Kipf et al., 2020): a learning rate of 5 × 10⁻⁴, batch size of 1024, 100 epochs, and the hinge margin γ = 1. We find that these hyperparameters work well for all other environments, except that Reacher uses a batch size of 256; mixed precision training was used for the non-equivariant method, the fully-equivariant method, and our method, in order to keep the batch size relatively high for stable contrastive learning. |
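The setup row above cites the hinge-based contrastive objective of Kipf et al. (2020) with margin γ = 1. As a rough illustration only, the sketch below shows the general shape of such a loss; the function names (`energy`, `contrastive_hinge_loss`) and the use of a precomputed latent transition `trans` are assumptions for illustration, not the paper's actual implementation.

```python
import numpy as np

def energy(z_pred, z_next):
    # Squared Euclidean distance between predicted and true next latents.
    return np.sum((z_pred - z_next) ** 2, axis=-1)

def contrastive_hinge_loss(z, z_next, z_neg, trans, gamma=1.0):
    """Hinge-based contrastive world-model loss in the style of
    Kipf et al. (2020). This is a hedged sketch, not the paper's code.

    z, z_next : latent states at times t and t+1, shape (batch, dim)
    z_neg     : negative latents sampled from other transitions
    trans     : predicted latent change T(z, a), precomputed here
    gamma     : hinge margin (the reported experiments use gamma = 1)
    """
    # Positive term: pull the predicted next latent toward the true one.
    positive = energy(z + trans, z_next)
    # Negative term: push unrelated latents at least gamma away.
    negative = np.maximum(0.0, gamma - energy(z_neg, z_next))
    return float(np.mean(positive + negative))
```

With a perfect prediction and a negative that coincides with the true next state, the loss reduces to the margin γ, which is why a relatively large batch (1024, or 256 for Reacher) matters: it supplies diverse negatives for the hinge term.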