Information Flow in Self-Supervised Learning
Authors: Zhiquan Tan, Jingqin Yang, Weiran Huang, Yang Yuan, Yifan Zhang
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The empirical evaluations underscore the effectiveness of M-MAE compared with state-of-the-art methods, including a 3.9% improvement in linear probing with ViT-Base and a 1% improvement in fine-tuning with ViT-Large, both on ImageNet. In this section, we empirically evaluate our Matrix Variational Masked Auto-Encoder (M-MAE) with TCR loss, placing special emphasis on its performance in comparison to the U-MAE model with square uniformity loss as a baseline. (A hedged sketch of the TCR uniformity term follows the table.) |
| Researcher Affiliation | Collaboration | (1) Department of Mathematical Sciences, Tsinghua University, Beijing, China; (2) IIIS, Tsinghua University, Beijing, China; (3) MIFA Lab, Qing Yuan Research Institute, SEIEE, Shanghai Jiao Tong University, Shanghai, China; (4) Shanghai AI Laboratory, Shanghai, China; (5) Shanghai Qizhi Institute, Shanghai, China. Weiran Huang is supported by the 2023 CCF-Baidu Open Fund and Microsoft Research Asia. |
| Pseudocode | No | No pseudocode or clearly labeled algorithm block was found in the paper. |
| Open Source Code | No | The paper does not contain an explicit statement about releasing open-source code for the described methodology, nor does it provide a link to a code repository. |
| Open Datasets | Yes | Datasets: ImageNet-1K. We utilize the ImageNet-1K dataset (Deng et al., 2009)... The experiments are conducted on CIFAR-10 using SimCLR. We have conducted experiments on CIFAR-100. |
| Dataset Splits | Yes | Datasets: ImageNet-1K. We utilize the ImageNet-1K dataset (Deng et al., 2009)... Both models are pre-trained for 200 epochs on ImageNet-1K with a batch size of 1024... Evaluation metrics. From Table 1, it is evident that the M-MAE loss outperforms both MAE and U-MAE in terms of linear evaluation and fine-tuning accuracy. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments, only mentioning general training parameters. |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies used in the experiments. |
| Experiment Setup | Yes | For a fair comparison, we adopt U-MAE's original hyperparameters: a mask ratio of 0.75 and a uniformity term coefficient λ of 0.01 by default. Both models are pre-trained for 200 epochs on ImageNet-1K with a batch size of 1024, and weight decay is likewise set to 0.05 to ensure parity in the experimental conditions. For ViT-Base, we set the TCR coefficient µ = 1, and for ViT-Large, we set µ = 3. (See the configuration sketch after the table.) |
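
Since the paper provides no pseudocode or code release, the following is a minimal sketch of how a TCR-style uniformity term could be combined with an MAE reconstruction loss, assuming the standard total-coding-rate (log-det) formulation. The function names, the `eps` distortion parameter, and the feature-normalization step are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn.functional as F


def tcr_uniformity_loss(z: torch.Tensor, eps: float = 0.5) -> torch.Tensor:
    """Negative total coding rate of a feature batch.

    z: (n, d) encoder features. eps is the TCR distortion parameter
    (its value here is an assumption, not reported in the paper).
    Minimizing the returned value maximizes the coding rate, i.e.
    encourages the features to spread out uniformly.
    """
    n, d = z.shape
    z = F.normalize(z, dim=-1)                 # unit-norm features
    cov = z.T @ z * (d / (n * eps ** 2))       # scaled d x d second-moment matrix
    logdet = torch.linalg.slogdet(
        torch.eye(d, device=z.device) + cov
    ).logabsdet
    return -0.5 * logdet


def m_mae_objective(recon_loss: torch.Tensor,
                    features: torch.Tensor,
                    mu: float = 1.0) -> torch.Tensor:
    """MAE reconstruction loss plus the TCR uniformity term, weighted by mu
    (mu = 1 for ViT-Base and mu = 3 for ViT-Large in the reported setup)."""
    return recon_loss + mu * tcr_uniformity_loss(features)
```

Under this sketch, the ViT-Base objective would be `recon_loss + 1.0 * tcr_uniformity_loss(features)`, matching the reported µ = 1.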
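The reported pre-training hyperparameters can likewise be collected into a single configuration for reference. This is an illustrative summary only: no codebase or configuration file is released, and the key names are hypothetical.

```python
# Hypothetical configuration summarizing the pre-training hyperparameters
# reported in the paper; key names are illustrative and do not correspond
# to any released codebase.
M_MAE_PRETRAIN_CONFIG = {
    "dataset": "ImageNet-1K",
    "epochs": 200,
    "batch_size": 1024,
    "mask_ratio": 0.75,
    "weight_decay": 0.05,
    "uniformity_coeff_lambda": 0.01,  # U-MAE square-uniformity coefficient (baseline default)
    "tcr_coeff_mu": {"ViT-Base": 1.0, "ViT-Large": 3.0},
}
```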