The Role of Entropy and Reconstruction in Multi-View Self-Supervised Learning
Authors: Borja Rodríguez Gálvez, Arno Blaas, Pau Rodriguez, Adam Golinski, Xavier Suau, Jason Ramapuram, Dan Busbridge, Luca Zappella
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically show that simply substituting the loss function and instead optimizing ER in SimCLR (Chen et al., 2020a), BYOL (Grill et al., 2020), and DINO (Caron et al., 2021) results in similar performance while improving resiliency with respect to training with smaller batch sizes or exponential moving average (EMA) coefficients. |
| Researcher Affiliation | Collaboration | ¹Division of Information Science and Engineering (ISE), KTH Royal Institute of Technology, Stockholm, Sweden; ²Apple. |
| Pseudocode | Yes | Algorithm 1 describes the main algorithm to maximise the ER bound. (A hedged sketch of an ER-style objective appears after this table.) |
| Open Source Code | Yes | Github repo: apple/ml-entropy-reconstruction. |
| Open Datasets | Yes | For all experiments, we pre-train a resnet50 (He et al., 2016) on the ImageNet (Deng et al., 2009) training set. |
| Dataset Splits | Yes | The paper reports results as "Val Accuracy", indicating evaluation on the ImageNet validation split after pre-training on the training set. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory amounts) are mentioned in the paper. It only describes the general experimental setup without hardware specifications. |
| Software Dependencies | No | The paper does not provide specific version numbers for key software components or libraries, such as Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | We train for 400 epochs and, following Chen et al. (2020b), we use a batch size of 4096 with the LARS optimizer (You et al., 2017) with linear warmup, a single-cycle cosine-annealed learning rate schedule, and a base learning rate of 0.3 (Goyal et al., 2017). (A sketch of this schedule appears below the table.) |
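
The "Pseudocode" row refers to Algorithm 1, which maximises the ER (entropy + reconstruction) lower bound on the mutual information between two views, I(Z1; Z2) ≥ H(Z1) + E[log q(Z1 | Z2)]. Below is a minimal PyTorch sketch of an ER-style objective for illustration only: the kernel-based entropy estimator, the isotropic Gaussian reconstruction term, and the `sigma` bandwidth are assumptions made here and are not taken from the paper or the apple/ml-entropy-reconstruction repository.

```python
import math
import torch
import torch.nn.functional as F

def er_loss(z1, z2, sigma=0.5):
    """Illustrative entropy + reconstruction (ER) objective (not the paper's exact estimators).

    z1, z2: (batch, dim) projections of two augmented views of the same images.
    Maximising H(Z1) + E[log q(Z1 | Z2)] is implemented as minimising its negative.
    """
    z1 = F.normalize(z1, dim=-1)
    z2 = F.normalize(z2, dim=-1)

    # Reconstruction term: log q(z1 | z2) under an isotropic Gaussian decoder (assumption).
    recon = -((z1 - z2) ** 2).sum(dim=-1).mean() / (2 * sigma ** 2)

    # Entropy term: resubstitution (kernel-density) estimate of H(Z1).
    # The kernel's normalising constant is dropped; it only shifts the estimate by a constant.
    pdists = torch.cdist(z1, z1) ** 2                      # pairwise squared distances
    log_kernel = -pdists / (2 * sigma ** 2)                # Gaussian kernel in log space
    entropy = -(torch.logsumexp(log_kernel, dim=-1) - math.log(z1.size(0))).mean()

    return -(entropy + recon)
```

Usage would look like `loss = er_loss(projector(encoder(view1)), projector(encoder(view2)))`, with the same loss dropped into a SimCLR/BYOL/DINO training loop in place of the original objective, as the quoted claim describes.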
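The experiment-setup quote specifies linear warmup into a single-cycle cosine-annealed learning rate with a base rate of 0.3 and a batch size of 4096. The following is a minimal sketch of that schedule, assuming the linear batch-size scaling rule of Goyal et al. (2017); the helper name `lr_at_step` and its arguments are hypothetical and not taken from the paper's code.

```python
import math

def lr_at_step(step, total_steps, warmup_steps, base_lr=0.3, batch_size=4096):
    """Linear warmup followed by a single cosine cycle to zero (illustrative)."""
    peak_lr = base_lr * batch_size / 256                     # linear scaling rule (assumption)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)         # linear warmup
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```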