Automatic Shortcut Removal for Self-Supervised Representation Learning
Authors: Matthias Minderer, Olivier Bachem, Neil Houlsby, Michael Tschannen
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Table 1 shows evaluation results across tasks and datasets. For all tested tasks, adding the lens leads to a significant improvement over the baseline. Further, the lens outperforms adversarial training using the fast gradient sign method (FGSM; Goodfellow et al., 2015; details in the appendix). A minimal FGSM sketch appears below the table. |
| Researcher Affiliation | Industry | Matthias Minderer¹², Olivier Bachem¹, Neil Houlsby¹, Michael Tschannen¹. ¹Google Research, Brain Team, Zürich, Switzerland. ²Work done as part of the Google AI Residency. Correspondence to: Matthias Minderer <mjlm@google.com>. |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide an explicit statement about, or a link to, open-source code for the methodology it describes. |
| Open Datasets | Yes | Self-supervised training is performed on ImageNet, which contains 1.3 million images, each belonging to one of 1000 object categories. Unless stated otherwise, we use the same preprocessing operations and batch size as Kolesnikov et al. (2019) for the respective tasks. To mitigate distribution shift between raw and lens-processed images, we feed both the lens-processed and the raw images to the feature extraction network (Kurakin et al. (2016) similarly feed processed and raw images for adversarial training). This batching is sketched below the table. |
| Dataset Splits | Yes | We report top-1 classification accuracy on the ImageNet validation set. In addition, to measure how well the learned representations transfer to unseen data, we also report downstream top-1 accuracy on the Places205 dataset. |
| Hardware Specification | Yes | Training was performed on 128 TPU v3 cores for Rotation and Exemplar, and 32 TPU v3 cores for Relative patch location and Jigsaw. |
| Software Dependencies | No | The paper does not specify ancillary software dependencies or version numbers. |
| Experiment Setup | Yes | Feature extractor and lens are trained synchronously using the Adam optimizer with β1 = 0.1, β2 = 10⁻³ and ϵ = 10⁻⁷ for 35 epochs. The learning rate is linearly ramped up from zero to 10⁻⁴ in the first epoch, stays at 10⁻⁴ until the end of the 32nd epoch, and is then linearly decayed to zero. The schedule is sketched below the table. |
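
For context on the adversarial baseline quoted in the Research Type row, here is a minimal PyTorch sketch of one-step FGSM (Goodfellow et al., 2015). The ϵ value, the [0, 1] clamp, and all names are illustrative assumptions, not details from the paper.

```python
import torch
import torch.nn.functional as F

def fgsm_perturb(model, x, y, eps=8 / 255):
    """One-step FGSM: move the input in the direction of the sign of the
    loss gradient with respect to the input."""
    # eps and the [0, 1] clamp are illustrative choices, not from the paper.
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()

# Illustrative usage with a toy classifier on random "images".
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
x = torch.rand(4, 3, 32, 32)
y = torch.randint(0, 10, (4,))
x_adv = fgsm_perturb(model, x, y)
```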
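
The Open Datasets row quotes the detail that both lens-processed and raw images are fed through the feature extractor to mitigate distribution shift. A minimal sketch of that batching, with hypothetical `lens` and `feature_net` modules, could look like this:

```python
import torch

def extract_features(feature_net, lens, images):
    """Feed both lens-processed and raw images through the feature extractor,
    as described in the quoted passage; module names are hypothetical."""
    processed = lens(images)
    both = torch.cat([processed, images], dim=0)  # one combined batch
    return feature_net(both)

# Toy stand-ins for the lens and the feature extractor.
lens = torch.nn.Conv2d(3, 3, kernel_size=3, padding=1)
feature_net = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 128))
features = extract_features(feature_net, lens, torch.rand(4, 3, 32, 32))  # -> (8, 128)
```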
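
Finally, the Experiment Setup row describes a warmup-constant-decay learning-rate schedule. Below is a minimal PyTorch sketch of that schedule, assuming a per-step scheduler; `steps_per_epoch` and the placeholder model are assumptions, and the Adam β values are left at framework defaults because the quoted values appear garbled by PDF extraction.

```python
import torch

def lr_multiplier(epoch, warmup=1.0, constant_until=32.0, total=35.0):
    """Schedule from the quoted setup: linear ramp over the first epoch,
    constant plateau until the end of epoch 32, then linear decay to zero
    by the end of epoch 35."""
    if epoch < warmup:
        return epoch / warmup
    if epoch < constant_until:
        return 1.0
    return max(0.0, (total - epoch) / (total - constant_until))

steps_per_epoch = 1000  # assumption; depends on dataset size and batch size
model = torch.nn.Linear(8, 8)  # placeholder for feature extractor + lens
# Peak LR of 1e-4 and eps of 1e-7 are from the quote; betas use defaults.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, eps=1e-7)
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer, lambda step: lr_multiplier(step / steps_per_epoch))
```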