Compressive Visual Representations
Authors: Kuang-Huei Lee, Anurag Arnab, Sergio Guadarrama, John Canny, Ian Fischer
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments confirm that adding compression to SimCLR and BYOL significantly improves linear evaluation accuracies and model robustness across a wide range of domain shifts. From Sec. 3 (Experimental Evaluation): We first describe our experimental set-up in Sec. 3.1, before evaluating the image representations learned by our self-supervised models in linear evaluation settings in Sec. 3.2. We then analyse the robustness and generalization of our self-supervised representations by evaluating model accuracy across a wide range of domain and distributional shifts in Sec. 3.3. Finally, we analyse the effect of compression strength in Sec. 3.4. |
| Researcher Affiliation | Collaboration | Kuang-Huei Lee (Google Research, leekh@google.com); Anurag Arnab (Google Research, aarnab@google.com); Sergio Guadarrama (Google Research, sguada@google.com); John Canny (Google Research, canny@google.com); Ian Fischer (Google Research, iansf@google.com). John Canny is also affiliated with the University of California, Berkeley. |
| Pseudocode | Yes | Pseudocode can be found in Sec. H. |
| Open Source Code | Yes | Code available at https://github.com/google-research/compressive-visual-representations |
| Open Datasets | Yes | We assess the performance of representations pretrained on the ImageNet training set [60] without using any labels. |
| Dataset Splits | Yes | Linear evaluation on ImageNet. We first evaluate the representations learned by our models by training a linear classifier on top of frozen features on the ImageNet training set, following standard practice [12, 30, 43, 44]. Learning with a few labels on ImageNet. After self-supervised pretraining on ImageNet, we learn a linear classifier on a small subset (1% or 10%) of the ImageNet training set, using the class labels this time, following the standard protocol of [12, 30]. ImageNet-C [37] adds synthetic corruptions to the ImageNet validation set. (A minimal sketch of the linear-evaluation protocol appears after this table.) |
| Hardware Specification | Yes | As in SimCLR and BYOL, we use a batch size of 4096 split over 64 Cloud TPU v3 cores. |
| Software Dependencies | No | The paper mentions software components like "LARS optimizer" and "cosine decay learning rate schedule" and refers to public implementations, but it does not provide specific version numbers for these software components or the underlying frameworks (e.g., TensorFlow, PyTorch). |
| Experiment Setup | Yes | We use the same set of image augmentations as in BYOL [30] for both BYOL and SimCLR, and also use BYOL's (4096, 256) two-layer projection head for both methods. We follow SimCLR and BYOL in using the LARS optimizer [74] with a cosine decay learning rate schedule [49] over 1000 epochs with a warm-up period, as detailed in Sec. A.4. For ablation experiments we train for 300 epochs instead. As in SimCLR and BYOL, we use a batch size of 4096 split over 64 Cloud TPU v3 cores. Except for ablation studies of compression strength, β is set to 1.0 for both C-SimCLR and C-BYOL. (A sketch of this optimizer and schedule configuration appears after this table.) |
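
As a concrete reading of the linear-evaluation protocol in the Dataset Splits row, here is a minimal sketch in JAX/optax: the pretrained encoder is frozen and only a linear classifier is trained with cross-entropy on ImageNet labels. The function and variable names (`linear_eval_loss`, `features`) are illustrative placeholders, not the paper's released API (the released code is TensorFlow).

```python
import jax
import optax

def linear_eval_loss(w, b, features, labels, num_classes=1000):
    """Cross-entropy of a linear classifier trained on frozen features.

    `features` are the pretrained encoder's outputs; stop_gradient makes
    explicit that only the linear layer parameters (w, b) are trained.
    """
    features = jax.lax.stop_gradient(features)
    logits = features @ w + b
    one_hot = jax.nn.one_hot(labels, num_classes)
    return optax.softmax_cross_entropy(logits=logits, labels=one_hot).mean()

# Gradients flow to the linear layer only; the encoder stays frozen.
grad_fn = jax.grad(linear_eval_loss, argnums=(0, 1))
```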
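The Experiment Setup row pins down the optimizer family and schedule (LARS, warm-up then cosine decay, 1000 epochs, batch size 4096) but not every hyperparameter; the warm-up length and base learning rate live in the paper's Sec. A.4, so the values marked below are assumptions. A minimal sketch in optax, illustrative only since the released code is TensorFlow:

```python
import optax

NUM_EXAMPLES = 1_281_167           # ImageNet training-set size
BATCH_SIZE = 4096                  # split over 64 Cloud TPU v3 cores per the table
EPOCHS = 1000                      # 300 for ablation experiments
WARMUP_EPOCHS = 10                 # assumption; exact value is in the paper's Sec. A.4
BASE_LR = 0.3 * BATCH_SIZE / 256   # assumption: the common linear-scaling rule
BETA = 1.0                         # compression strength β from the table; assumed to
                                   # weight a compression term added to the SSL loss

steps_per_epoch = NUM_EXAMPLES // BATCH_SIZE
schedule = optax.warmup_cosine_decay_schedule(
    init_value=0.0,
    peak_value=BASE_LR,
    warmup_steps=WARMUP_EPOCHS * steps_per_epoch,
    decay_steps=EPOCHS * steps_per_epoch,
)
optimizer = optax.lars(learning_rate=schedule, weight_decay=1e-6)
```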