Self-supervised Transformation Learning for Equivariant Representations
Authors: Jaemyung Yu, Jaehyun Choi, Dong-Jae Lee, Hyeong Gwon Hong, Junmo Kim
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the approach's effectiveness across diverse classification and detection tasks, outperforming existing methods in 7 out of 11 benchmarks and excelling in detection. Additionally, by incorporating AugMix, a complex transformation that is not feasible with existing equivariant learning, STL enhances performance across all tasks, showcasing its broad applicability. STL's compatibility with diverse base models further highlights its versatility, as it achieves the highest average accuracy across foundational models. Extensive experiments and ablation studies further validate STL's ability to capture interdependencies among transformations in an unsupervised manner. |
| Researcher Affiliation | Academia | Jaemyung Yu, Jaehyun Choi, Dong-Jae Lee, Hyeong Gwon Hong, Junmo Kim — Korea Advanced Institute of Science and Technology (KAIST), {jaemyung,chlwogus,jhtwosun,honggudrnjs,junmo.kim}@kaist.ac.kr |
| Pseudocode | No | The paper describes the methodology in text and mathematical formulas but does not provide pseudocode or an algorithm block. |
| Open Source Code | Yes | The code is available at https://github.com/jaemyung-u/stl. |
| Open Datasets | Yes | We pretrain on STL10 [7] with ResNet-18 and ImageNet100 [39, 42] with ResNet-50, following the split in [42]. Table 10: Dataset Information. Overview of dataset composition and evaluation metrics. |
| Dataset Splits | Yes | Table 10: Dataset Information. Overview of dataset composition and evaluation metrics. Each dataset specifies the number of classes, training/validation/test splits, and the corresponding evaluation metric. For linear evaluation dataset, validation samples are randomly selected from the training split if an official validation split is not provided. |
| Hardware Specification | Yes | For the pretraining experiments, we use NVIDIA RTX 4090. |
| Software Dependencies | No | The paper mentions various architectures and methods but does not provide specific version numbers for software libraries like PyTorch, TensorFlow, or other key dependencies used in the implementation. |
| Experiment Setup | Yes | For STL and explicit baselines (SEN, EquiMod, and SIE), we use an equivariant transformation network with a hypernetwork based on SIE... STL also includes a 3-layer MLP with a 512-dimensional hidden layer to encode 128-dimensional transformation representations from input pairs... The transformation prediction loss weight is set to 0.5 for implicit baselines, and the equivariant learning weight is set to 1 for explicit baselines. STL uses weights of 1, 1, and 0.2 for invariant, equivariant, and transformation learning losses, respectively. All methods are trained for 500 epochs with a batch size of 256, using a cosine learning rate schedule without restarts [30]. The initial learning rate is set at 0.03, with a weight decay of 0.0005. The model includes a 3-layer projection MLP head, g(·), with a hidden dimension of 2048 and an output dimension of 128. Batch normalization [24] is excluded from the last layer. |
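The training recipe quoted in the setup row (cosine learning-rate schedule without restarts, initial rate 0.03 over 500 epochs) can be sketched as follows. This is a minimal illustrative sketch, not the authors' code: the function name and the assumption that the rate decays to zero are ours, and the loss-combination line simply restates the reported weights of 1, 1, and 0.2.

```python
import math

def cosine_lr(epoch, total_epochs=500, base_lr=0.03, min_lr=0.0):
    """Cosine learning-rate schedule without restarts: decays
    smoothly from base_lr at epoch 0 to min_lr at the final epoch."""
    progress = epoch / total_epochs
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))

def stl_total_loss(inv_loss, equi_loss, trans_loss):
    """Weighted sum of the three STL objectives, using the reported
    weights: 1 (invariant), 1 (equivariant), 0.2 (transformation)."""
    return 1.0 * inv_loss + 1.0 * equi_loss + 0.2 * trans_loss

# The schedule starts at the initial rate and decays to the floor:
print(round(cosine_lr(0), 4))    # 0.03 at the start of training
print(round(cosine_lr(250), 4))  # 0.015 at the halfway point
print(round(cosine_lr(500), 4))  # 0.0 at the final epoch
```

In practice the same schedule is available off the shelf (e.g. PyTorch's `CosineAnnealingLR`); the explicit formula above just makes the quoted hyperparameters concrete.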