Poly-View Contrastive Learning

Authors: Amitis Shidani, R Devon Hjelm, Jason Ramapuram, Russell Webb, Eeshan Gunesh Dhekane, Dan Busbridge

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We investigate matching when there are more than two related views which we call poly-view tasks, and derive new representation learning objectives using information maximization and sufficient statistics. We show that with unlimited computation, one should maximize the number of related views, and with a fixed compute budget, it is beneficial to decrease the number of unique samples whilst increasing the number of views of those samples. In particular, poly-view contrastive models trained for 128 epochs with batch size 256 outperform SimCLR trained for 1024 epochs at batch size 4096 on ImageNet1k, challenging the belief that contrastive models require large batch sizes and many training epochs. [...] 3 EXPERIMENTS [...] 3.1 SYNTHETIC 1D GAUSSIAN [...] 3.2 REAL-WORLD IMAGE REPRESENTATION LEARNING
Researcher Affiliation | Collaboration | Amitis Shidani, Department of Statistics, University of Oxford, UK (shidani@stats.ox.ac.uk); Devon Hjelm, Jason Ramapuram, Russ Webb, Eeshan Gunesh Dhekane, and Dan Busbridge, Apple (dbusbridge@apple.com)
Pseudocode | Yes | Algorithm 1: Poly-View Contrastive Loss pseudocode. [...] Algorithm 2: Sufficient Statistics Contrastive Loss pseudocode. (A hedged sketch of a multi-view contrastive loss follows the table.)
Open Source Code | No | The paper does not contain an explicit statement indicating that the authors are releasing their code for the described methodology, nor does it provide a direct link to a code repository.
Open Datasets | Yes | We investigate image representation learning on ImageNet1k (Russakovsky et al., 2014).
Dataset Splits | Yes | This dataset is commonly used in computer vision and contains 1.28M training, 50K validation and 100K test images of varying resolutions, each with a label from one of 1000 object classes.
Hardware Specification | No | The paper mentions training models like 'ResNet 50' and performing experiments, but it does not specify any particular hardware components such as GPU models, CPU types, or memory configurations used for these experiments.
Software Dependencies | No | The paper mentions 'PyTorch profiler' and 'einops (Rogozhnikov, 2022)' but does not specify version numbers for these software components. It also mentions optimizers like 'AdamW (Loshchilov & Hutter, 2019)' and 'LARS (You et al., 2017)' without providing their versions.
Experiment Setup | Yes | Table 1: Hyperparameters for all ImageNet1k experiments in Section 3.2. Weight initialization: kaiming_uniform (He et al., 2015); Backbone normalization: BatchNorm; Learning rate schedule: Single Cycle Cosine; Learning rate warmup (epochs): 10; Learning rate base value: 0.2 × 4096 / 256 = 3.2; Optimizer: LARS (You et al., 2017); Weight decay: 1 × 10⁻⁴; Numerical precision: bf16; Augmentation stack: SimCLR (Chen et al., 2020a).
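
The poly-view setup described in the research-type row, together with the SimCLR augmentation stack listed in Table 1, implies producing several independently augmented views of each image. Below is a minimal sketch of how such an M-view pipeline might look with torchvision; the `simclr_like` transform stack and the `MultiViewTransform` wrapper are illustrative assumptions, not the paper's exact recipe.

```python
from torchvision import transforms

# Rough SimCLR-style augmentation stack (illustrative; not the paper's exact settings).
simclr_like = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.RandomApply([transforms.ColorJitter(0.4, 0.4, 0.4, 0.1)], p=0.8),
    transforms.RandomGrayscale(p=0.2),
    transforms.GaussianBlur(kernel_size=23),
    transforms.ToTensor(),
])


class MultiViewTransform:
    """Return M independently augmented views of the same image (hypothetical wrapper)."""

    def __init__(self, transform, num_views=4):
        self.transform = transform
        self.num_views = num_views

    def __call__(self, image):
        return [self.transform(image) for _ in range(self.num_views)]
```

A dataset wrapped with `MultiViewTransform(simclr_like, num_views=M)` yields M augmented tensors per image, which an encoder can turn into the (M, N, D) embedding batch used in the loss sketch below.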
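The pseudocode row references Algorithm 1 (Poly-View Contrastive Loss). As a hedged illustration only, here is a generic multi-view InfoNCE-style loss in which every other view of the same sample is treated as a positive and all views of other samples as negatives; this is a sketch of the general technique, not a reproduction of the paper's Algorithm 1.

```python
import torch
import torch.nn.functional as F


def multi_view_info_nce(z: torch.Tensor, temperature: float = 0.1) -> torch.Tensor:
    """Generic multi-view InfoNCE-style loss (illustrative sketch).

    z: embeddings of shape (M, N, D) -- M augmented views of N samples, M >= 2.
    Every other view of the same sample is a positive; all views of other
    samples act as negatives.
    """
    M, N, D = z.shape
    feats = F.normalize(z, dim=-1).reshape(M * N, D)
    sim = feats @ feats.T / temperature                      # (MN, MN) similarities

    self_mask = torch.eye(M * N, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(self_mask, float("-inf"))          # drop self-pairs

    # row m*N + n of feats corresponds to view m of sample n
    sample_ids = torch.arange(N, device=z.device).repeat(M)
    pos_mask = (sample_ids[:, None] == sample_ids[None, :]) & ~self_mask

    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    # average log-likelihood over the M - 1 positives of each anchor
    pos_log_prob = log_prob.masked_fill(~pos_mask, 0.0).sum(dim=1) / pos_mask.sum(dim=1)
    return -pos_log_prob.mean()
```

For M = 2 this reduces to a SimCLR-style pairwise objective; the paper's reported gains come from varying the number of views while shrinking batch size and epoch count.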
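The Table 1 values quoted in the experiment-setup row can be collected into a small reference config. The sketch below assumes the standard linear learning-rate scaling rule (base value × batch size / 256), which matches the quoted 0.2 × 4096 / 256 = 3.2; the dictionary keys and helper function are hypothetical.

```python
# Hypothetical reconstruction of the quoted Table 1 setup as a plain config dict;
# only the values come from the paper, the key names and helper are illustrative.

def scaled_lr(base_lr_per_256: float, batch_size: int) -> float:
    """Linear learning-rate scaling rule: lr = base * batch_size / 256."""
    return base_lr_per_256 * batch_size / 256


imagenet1k_config = {
    "weight_init": "kaiming_uniform",            # He et al., 2015
    "backbone_norm": "BatchNorm",
    "lr_schedule": "single_cycle_cosine",
    "lr_warmup_epochs": 10,
    "base_lr": scaled_lr(0.2, batch_size=4096),  # 0.2 * 4096 / 256 = 3.2
    "optimizer": "LARS",                         # You et al., 2017
    "weight_decay": 1e-4,
    "precision": "bf16",
    "augmentations": "SimCLR",                   # Chen et al., 2020a
}
```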