Representation Learning via Consistent Assignment of Views over Random Partitions
Authors: Thalles Santos Silva, Adín Ramírez Rivera
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through an extensive evaluation, we demonstrate that CARP's representations are suitable for learning downstream tasks. We evaluate CARP's representation capabilities in 17 datasets across many standard protocols, including linear evaluation, few-shot classification, k-NN, k-means, image retrieval, and copy detection. We compare CARP's performance to 11 existing self-supervised methods. We extensively ablate our method and demonstrate that our proposed random partition pretext task improves the quality of the learned representations by devising multiple random classification tasks. |
| Researcher Affiliation | Academia | Thalles Silva, Institute of Computing, University of Campinas, thalles.silva@students.ic.unicamp.br; Adín Ramírez Rivera, Department of Informatics, University of Oslo, adinr@uio.no |
| Pseudocode | Yes | Appendix E: Pseudocode of CARP in a PyTorch-like Style (a hedged sketch of the random-partition loss follows the table). |
| Open Source Code | Yes | Code at https://sthalles.github.io/carp/. |
| Open Datasets | Yes | Table 2 reports clustering performance metrics of various clustering-based SSL methods on the ImageNet-1M [36], CIFAR-10/100 [27], and the GTSRB [40] datasets. |
| Dataset Splits | Yes | For ImageNet-1M evaluation, we trained a linear classifier on top of the frozen representations extracted from the last average pooling layer of the ResNet50 encoder for 100 epochs, following Zhou et al.'s [49] protocol. ... We use the validation split to assess the quality of the learned prototypes. (A minimal linear-probe sketch follows the table.) |
| Hardware Specification | Yes | For all experiments, we used 4 A100 40GB GPUs and gradient accumulation to simulate large batch sizes. (A gradient-accumulation sketch follows the table.) |
| Software Dependencies | No | The paper mentions 'PyTorch style pseudo-code' and the 'faiss library [25]', but does not specify version numbers for these or other software dependencies. |
| Experiment Setup | Yes | We train CARP on the ImageNet-1M unlabeled dataset using ResNet50 [23] encoders. We take the output representation of the last global average pooling layer (a 2048-dim vector) and project it to a 256-dim vector. ... The hidden units of the projection head contain 2048 neurons. ... K = 65,536 prototypes. ... N_P = 128, which creates subsets containing N_B = 512 randomly chosen prototypes. ... CARP is pre-trained with the LARS [47] optimizer, end to end, with a weight decay of 1 × 10⁻⁶. For models training up to 200 epochs, the learning rate starts from 0.6 and decays to 0.006 with a cosine scheduling [30] without warmups. For models pre-trained for more than 400 epochs, the learning rate starts at 0.3 and decays to 0.003 using the same cosine scheduler. We train the system with a global batch size of 4096 observations. (A configuration sketch follows the table.) |
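For the Pseudocode row: the authoritative PyTorch-style pseudocode is in the paper's Appendix E and in the released code. The snippet below is only a minimal sketch of the random-partition idea as quoted in the abstract, assuming a symmetric cross-view consistency loss over per-block softmax assignments; the function name, temperature, and block handling are illustrative, and the authors' actual loss (e.g., its exact consistency/entropy terms, momentum teacher, and stop-gradient placement) may differ.

```python
import torch
import torch.nn.functional as F


def random_partition_consistency_loss(z1, z2, prototypes, n_blocks=128, temperature=0.1):
    """Sketch: split the K prototypes into random blocks and encourage the two
    views to produce consistent assignments within every block."""
    K = prototypes.shape[0]
    perm = torch.randperm(K, device=prototypes.device)
    blocks = perm.chunk(n_blocks)                      # 128 blocks of 512 for K = 65,536

    z1 = F.normalize(z1, dim=-1)                       # (B, 256) projected view 1
    z2 = F.normalize(z2, dim=-1)                       # (B, 256) projected view 2
    protos = F.normalize(prototypes, dim=-1)           # (K, 256) prototype vectors
    logits1 = z1 @ protos.t() / temperature            # (B, K) similarities to all prototypes
    logits2 = z2 @ protos.t() / temperature

    loss = 0.0
    for idx in blocks:
        p1 = F.softmax(logits1[:, idx], dim=-1)        # assignment of view 1 over this block
        p2 = F.softmax(logits2[:, idx], dim=-1)        # assignment of view 2 over this block
        # symmetric cross-view consistency: each view predicts the other's assignment
        loss = loss - 0.5 * (p2.detach() * torch.log(p1 + 1e-8)).sum(-1).mean()
        loss = loss - 0.5 * (p1.detach() * torch.log(p2 + 1e-8)).sum(-1).mean()
    return loss / len(blocks)


# usage with random tensors: 8 embeddings of dimension 256, K = 65,536 prototypes
z1, z2 = torch.randn(8, 256), torch.randn(8, 256)
prototypes = torch.randn(65536, 256)
print(random_partition_consistency_loss(z1, z2, prototypes).item())
```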
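For the Dataset Splits row: the linear-evaluation protocol quoted above trains only a classifier on frozen backbone features. The sketch below assumes a 2048-d pooled ResNet50 feature, the 1,000 ImageNet classes, and a placeholder SGD setup; the exact optimizer, augmentations, and schedule follow Zhou et al.'s [49] protocol and are not reproduced here. In practice the backbone would be loaded with the pretrained CARP weights rather than random ones.

```python
import torch
from torch import nn
from torchvision.models import resnet50

# Hedged sketch of linear evaluation: freeze the backbone, train only a linear head.
encoder = resnet50()
encoder.fc = nn.Identity()                 # expose the 2048-d global-average-pooled feature
for p in encoder.parameters():
    p.requires_grad = False                # freeze the backbone
encoder.eval()

classifier = nn.Linear(2048, 1000)         # ImageNet-1M has 1000 classes
optimizer = torch.optim.SGD(classifier.parameters(), lr=0.1, momentum=0.9)
criterion = nn.CrossEntropyLoss()

images = torch.randn(8, 3, 224, 224)       # placeholder batch
labels = torch.randint(0, 1000, (8,))
with torch.no_grad():
    feats = encoder(images)                # frozen features, no gradient to the backbone
loss = criterion(classifier(feats), labels)
loss.backward()
optimizer.step()
```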
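For the Hardware Specification row: gradient accumulation simulates the 4096 global batch by summing gradients over several micro-batches before a single optimizer step. A minimal sketch with placeholder model and data; the micro-batch size of 512 is an assumption chosen so that 8 accumulation steps reach 4096.

```python
import torch
from torch import nn

# Placeholder model and data; only the accumulation pattern is the point here.
model = nn.Linear(128, 10)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

global_batch, micro_batch = 4096, 512
accumulation_steps = global_batch // micro_batch       # 8 micro-batches per optimizer step

optimizer.zero_grad()
for step in range(accumulation_steps):
    x = torch.randn(micro_batch, 128)                  # stand-in for one micro-batch
    y = torch.randint(0, 10, (micro_batch,))
    loss = loss_fn(model(x), y) / accumulation_steps   # scale so accumulated grads average
    loss.backward()                                    # gradients add up in the .grad buffers
optimizer.step()                                       # one update for the simulated 4096 batch
optimizer.zero_grad()
```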
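For the Experiment Setup row: the quoted configuration (ResNet50 backbone, a projection head with 2048 hidden units mapping the 2048-d pooled feature to 256-d, and K = 65,536 prototypes partitioned into N_P = 128 blocks of N_B = 512) can be summarized as below. The head's BatchNorm/ReLU layout and the prototype layer as a bias-free linear map are assumptions; the LARS optimizer and cosine schedule are not shown, since LARS is not part of core PyTorch. The released code is the reference implementation.

```python
import torch
from torch import nn
from torchvision.models import resnet50


class CARPNetSketch(nn.Module):
    """Illustrative module: ResNet50 backbone, 2048-unit hidden projector to a
    256-d embedding, and a bias-free linear layer holding the prototype vectors."""

    def __init__(self, hidden_dim=2048, proj_dim=256, n_partitions=128, block_size=512):
        super().__init__()
        backbone = resnet50()
        backbone.fc = nn.Identity()                    # keep the 2048-d pooled feature
        self.backbone = backbone
        self.projector = nn.Sequential(                # head layout is an assumption
            nn.Linear(2048, hidden_dim),
            nn.BatchNorm1d(hidden_dim),
            nn.ReLU(inplace=True),
            nn.Linear(hidden_dim, proj_dim),
        )
        num_prototypes = n_partitions * block_size     # 128 * 512 = 65,536 prototypes
        self.prototypes = nn.Linear(proj_dim, num_prototypes, bias=False)

    def forward(self, x):
        z = self.projector(self.backbone(x))                           # (B, 256) embedding
        scores = self.prototypes(nn.functional.normalize(z, dim=-1))   # (B, 65536) prototype scores
        return z, scores


model = CARPNetSketch()
z, scores = model(torch.randn(4, 3, 224, 224))
print(z.shape, scores.shape)                           # torch.Size([4, 256]) torch.Size([4, 65536])
```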