In-Context Symmetries: Self-Supervised Learning through Contextual World Models

Authors: Sharut Gupta, Chenyu Wang, Yifei Wang, Tommi Jaakkola, Stefanie Jegelka

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirically, we demonstrate significant performance gains over existing methods on equivariance-related tasks, supported by both qualitative and quantitative evaluations.
Researcher Affiliation | Academia | Sharut Gupta*, Chenyu Wang*, Yifei Wang*, Tommi Jaakkola (MIT CSAIL) {sharut, wangchy, yifei_w, jaakkola}@mit.edu; Stefanie Jegelka (TU Munich, MIT CSAIL) stefje@mit.edu
Pseudocode | No | No explicit pseudocode or algorithm block found. The methods are described in prose.
Open Source Code | No | Question: Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material? Answer: [No] Justification: We are working towards organizing the code base and will make it available by the rebuttal.
Open Datasets | Yes | We use the 3D Invariant Equivariant Benchmark (3DIEBench) [16] and CIFAR10 to test our approach.
Dataset Splits | Yes | We use the standard training, validation and test splits, made publicly available by the authors [16].
Hardware Specification | Yes | Each experiment was conducted on 1 NVIDIA Tesla V100 GPU with 32GB of accelerator RAM. The CPUs used were Intel Xeon E5-2698 v4 processors with 20 cores and 384GB of RAM.
Software Dependencies | No | All experiments were implemented using the PyTorch deep learning framework.
Experiment Setup | Yes | On all datasets, we train CONTEXTSSL with the Adam optimizer, using a learning rate of 5e-5 and weight decay of 1e-3. For baseline self-supervised approaches in their original architectures, we use a learning rate of 1e-3 with no weight decay. We fix the maximum training context length to 128.
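
The reported setup maps directly onto a short PyTorch sketch, shown below. This is only an illustration of the stated hyperparameters; the encoder is a placeholder (the authors' code was not released at submission time), and the constant MAX_CONTEXT_LENGTH merely records the paper's stated maximum context length.

    import torch

    # Placeholder encoder; the actual CONTEXTSSL architecture is not public,
    # so this stands in only to show the optimizer configuration.
    encoder = torch.nn.Sequential(
        torch.nn.Linear(512, 1024),
        torch.nn.ReLU(),
        torch.nn.Linear(1024, 512),
    )

    # CONTEXTSSL: Adam with learning rate 5e-5 and weight decay 1e-3
    optimizer = torch.optim.Adam(encoder.parameters(), lr=5e-5, weight_decay=1e-3)

    # Baselines in their original architectures: learning rate 1e-3, no weight decay
    baseline_optimizer = torch.optim.Adam(encoder.parameters(), lr=1e-3, weight_decay=0.0)

    # Maximum training context length reported in the paper
    MAX_CONTEXT_LENGTH = 128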