What Should Not Be Contrastive in Contrastive Learning
Authors: Tete Xiao, Xiaolong Wang, Alexei A. Efros, Trevor Darrell
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We train our model on the 100-category ImageNet (IN-100) dataset... We test the models on various downstream datasets... Table 1: Classification accuracy... Table 2: Evaluation on multiple downstream tasks... Table 3: Evaluation on datasets of real-world corruptions... Ablation: MoCo w/ all augmentations vs. LooC. |
| Researcher Affiliation | Academia | Tete Xiao UC Berkeley Xiaolong Wang UC San Diego Alexei A. Efros UC Berkeley Trevor Darrell UC Berkeley |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | No | No explicit statement regarding the release of source code for the described methodology or a link to a code repository was found. |
| Open Datasets | Yes | We train our model on the 100-category ImageNet (IN-100) dataset, a subset of the ImageNet (Deng et al., 2009) dataset, for fast ablation studies of the proposed framework. We split the subset following (Tian et al., 2019). |
| Dataset Splits | Yes | IN-100 validation set;... The iNaturalist 2019 (iNat-1k) dataset (Van Horn et al., 2018)... We randomly reallocate 10% of training images into the validation set as the original validation set is relatively small. |
| Hardware Specification | No | No specific hardware details such as GPU models (e.g., NVIDIA A100), CPU models (e.g., Intel Core i7), or cloud instance types were mentioned for running experiments. The paper only refers to general training processes without specifying the computational infrastructure. |
| Software Dependencies | No | No specific software dependencies with version numbers (e.g., 'PyTorch 1.9', 'CUDA 11.1') were found in the paper. |
| Experiment Setup | Yes | We train the network for 500 epochs, and decrease the learning rate at 300 and 400 epochs. We use separate queues (He et al., 2020) for individual embedding space and set the queue size to 16,384. ...The batch size during training of the backbone and the linear layer is set to 256. ...We train the linear layer for 200 epochs for IN-100 and CUB-200, 100 epochs for iNat-1k, optimized by momentum SGD with a learning rate of 30 decreased by 0.1 at 60% and 80% of training schedule; for Flowers-102 we train the linear layer with Adam optimizer for 250 iterations with a learning rate of 0.03. (A hedged sketch of this linear-evaluation schedule follows the table.) |
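
The linear-evaluation recipe quoted in the Experiment Setup row maps onto a short training script. Below is a minimal PyTorch sketch, not the authors' code: the frozen backbone, data loaders, feature dimension (2048, assuming a ResNet-50 encoder), and class count are hypothetical placeholders, and random tensors stand in for real features. Only the epoch count, batch size of 256, learning rate of 30, and the 0.1 decay at 60% and 80% of the schedule come from the paper's description.

```python
import torch
import torch.nn as nn

# Hypothetical sizes: ResNet-50 feature dimension and IN-100 class count.
feat_dim, num_classes, batch_size = 2048, 100, 256

linear = nn.Linear(feat_dim, num_classes)   # linear probe on frozen features
epochs = 200                                 # 200 for IN-100/CUB-200, 100 for iNat-1k (per the quote)

optimizer = torch.optim.SGD(linear.parameters(), lr=30.0, momentum=0.9)
# Learning rate decreased by 0.1 at 60% and 80% of the training schedule.
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[int(0.6 * epochs), int(0.8 * epochs)], gamma=0.1)
criterion = nn.CrossEntropyLoss()

for epoch in range(epochs):
    # Placeholder batch standing in for outputs of the frozen pretrained
    # encoder; in practice these come from the backbone and a loader with
    # the 256-image batch size quoted above.
    feats = torch.randn(batch_size, feat_dim)
    labels = torch.randint(0, num_classes, (batch_size,))

    loss = criterion(linear(feats), labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()
```

For Flowers-102 the quoted setup instead trains the linear layer with Adam for 250 iterations at a learning rate of 0.03, which would replace the SGD optimizer and MultiStepLR scheduler in the sketch above.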