The Role of Entropy and Reconstruction in Multi-View Self-Supervised Learning

Authors: Borja Rodríguez Gálvez, Arno Blaas, Pau Rodriguez, Adam Golinski, Xavier Suau, Jason Ramapuram, Dan Busbridge, Luca Zappella

ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically show that simply substituting the loss function and instead optimizing ER in SimCLR (Chen et al., 2020a), BYOL (Grill et al., 2020), and DINO (Caron et al., 2021) results in similar performance while improving resiliency with respect to training with smaller batch sizes or exponential moving average (EMA) coefficients. |
| Researcher Affiliation | Collaboration | (1) Division of Information Science and Engineering (ISE), KTH Royal Institute of Technology, Stockholm, Sweden; (2) Apple. |
| Pseudocode | Yes | Algorithm 1 describes the main algorithm to maximise the ER bound. (A sketch of such an objective follows this table.) |
| Open Source Code | Yes | GitHub repo: apple/ml-entropy-reconstruction. |
| Open Datasets | Yes | For all experiments, we pre-train a ResNet-50 (He et al., 2016) on the ImageNet (Deng et al., 2009) training set. (A sketch of such a pre-training data pipeline follows this table.) |
| Dataset Splits | Yes | Val Accuracy |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory amounts) are mentioned in the paper. It only describes the general experimental setup without hardware specifications. |
| Software Dependencies | No | The paper does not provide specific version numbers for key software components or libraries, such as Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | We train for 400 epochs and, following Chen et al. (2020b), we use a batch size of 4096 with the LARS optimizer (You et al., 2017) with linear warmup, a single-cycle cosine-annealed learning rate schedule, and a base learning rate of 0.3 (Goyal et al., 2017). (The schedule is sketched after this table.) |
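
For readers of the Research Type and Pseudocode rows, here is a minimal PyTorch sketch of an entropy-plus-reconstruction (ER) objective. It assumes a leave-one-out Gaussian kernel-density (plug-in) estimate for the entropy term and a cosine (von Mises-Fisher-style) log-likelihood for the reconstruction term; the function name `er_loss`, the `bandwidth` parameter, and both estimator choices are illustrative assumptions rather than the paper's Algorithm 1.

```python
import math
import torch
import torch.nn.functional as F


def er_loss(z1: torch.Tensor, z2: torch.Tensor, bandwidth: float = 0.5) -> torch.Tensor:
    """Negative ER bound, -(H(Z2) + E[log q(Z2 | Z1)]), to be minimized.

    z1: projections/predictions for view 1 (online branch), shape (n, d).
    z2: embeddings of view 2 (online or EMA branch, method-dependent), shape (n, d).
    """
    p = F.normalize(z1, dim=-1)
    z = F.normalize(z2, dim=-1)
    n = z.shape[0]

    # Reconstruction term: cosine log-likelihood of z2 given z1
    # (von Mises-Fisher style, up to constants).
    reconstruction = (p * z).sum(dim=-1).mean()

    # Entropy term: leave-one-out Gaussian kernel-density (plug-in) estimate of H(Z2),
    # up to an additive constant from the kernel normalization.
    sq_dists = 2.0 - 2.0 * z @ z.t()        # squared distances, since z is L2-normalized
    sq_dists.fill_diagonal_(float("inf"))   # exclude self-matches (leave-one-out)
    log_kde = torch.logsumexp(-sq_dists / (2.0 * bandwidth ** 2), dim=-1) - math.log(n - 1)
    entropy = -log_kde.mean()

    return -(entropy + reconstruction)
```

In a SimCLR-style setup both arguments would come from the online network; in BYOL- or DINO-style setups `z2` would typically be the stop-gradient output of an EMA target network.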
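The pre-training data described in the Open Datasets row (a ResNet-50 trained on the ImageNet training set) could be set up roughly as follows; the SimCLR-style augmentation recipe, the `TwoViews` wrapper, and the dataset path are assumptions for illustration, not the authors' exact pipeline.

```python
import torch
from torchvision import datasets, models, transforms

# Typical SimCLR-style augmentations (an assumption; the paper's exact recipe may differ).
augment = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(0.4, 0.4, 0.4, 0.1),
    transforms.RandomGrayscale(p=0.2),
    transforms.ToTensor(),
])


class TwoViews:
    """Return two independently augmented views of the same image."""

    def __init__(self, transform):
        self.transform = transform

    def __call__(self, img):
        return self.transform(img), self.transform(img)


# Placeholder path; requires the ImageNet archives to be available locally.
train_set = datasets.ImageNet(root="/path/to/imagenet", split="train",
                              transform=TwoViews(augment))

backbone = models.resnet50(weights=None)   # trained from scratch during pre-training
backbone.fc = torch.nn.Identity()          # keep 2048-d features; a projector goes on top
```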
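The optimization recipe in the Experiment Setup row (linear warmup into a single cosine-annealing cycle, base learning rate 0.3 scaled linearly with the batch size of 4096) can be sketched as below. The `learning_rate` helper and the choice of warmup length are assumptions; the LARS optimizer itself (You et al., 2017) is not part of core PyTorch and is omitted here.

```python
import math


def learning_rate(step: int, total_steps: int, warmup_steps: int,
                  base_lr: float = 0.3, batch_size: int = 4096) -> float:
    """Per-step learning rate: linear warmup, then one cosine-annealing cycle."""
    scaled_lr = base_lr * batch_size / 256  # linear scaling rule (Goyal et al., 2017)
    if step < warmup_steps:
        return scaled_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return scaled_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

With 400 epochs at batch size 4096, ImageNet's roughly 1.28M training images give about 312 steps per epoch, so `total_steps` would be on the order of 125,000.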