Modality-Agnostic Self-Supervised Learning with Meta-Learned Masked Auto-Encoder

Authors: Huiwon Jang, Jihoon Tack, Daewon Choi, Jongheon Jeong, Jinwoo Shin

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiment demonstrates the superiority of MetaMAE in the modality-agnostic SSL benchmark (called DABS), significantly outperforming prior baselines. Overall, our experimental results are strong, consistently and significantly outperforming previous modality-agnostic SSL methods in linear evaluation. In this section, we demonstrate the effectiveness of the proposed framework by measuring the linear-evaluation performance under various datasets across modalities.
Researcher Affiliation | Academia | Huiwon Jang (A), Jihoon Tack (A), Daewon Choi (B), Jongheon Jeong (A), Jinwoo Shin (A); (A) Korea Advanced Institute of Science and Technology (KAIST), (B) Korea University; {huiwoen0516, jihoontack}@kaist.ac.kr
Pseudocode | Yes | Our framework is visually depicted in Figure 1, and the pseudo-code is provided in Algorithm 1 (Algorithm 1: MetaMAE: Meta-Learning Modality-Agnostic Masked Auto-Encoder). An illustrative sketch of such a meta-learned masked auto-encoding step is given after this table.
Open Source Code | Yes | Code is available at https://github.com/alinlab/MetaMAE.
Open Datasets | Yes | We select 8 sub-benchmarks from the DABS 2.0 benchmark [85], categorizing the modalities for each sub-benchmark. We pretrain and transfer MetaMAE on the selected datasets: time-series modality: the PAMAP dataset [71]; tabular modality: the HIGGS dataset [69]; token modality: the Genomics [72] and Pfam [20] datasets; speech modality: LibriSpeech [66]; RGB image modality: ImageNet32 [17] and WaferMap [98]; vision-language modality: MSCOCO [54] and VQA [1].
Dataset Splits | Yes | We pretrain on each dataset for 100K iterations and run 100 epochs of transfer learning, following [85] for the overall experiments. Note that we use the dataset splits described in [83, 85]. This schedule is collected, together with the optimizer settings from the Experiment Setup row, in the second sketch after this table.
Hardware Specification | No | The paper does not provide specific details about the hardware used for its experiments, such as GPU models, CPU types, or specific cloud instance configurations.
Software Dependencies | No | The paper mentions using PyTorch and BETTY [13] but does not specify version numbers for these or any other software dependencies needed to replicate the experiments.
Experiment Setup | Yes | We summarize our selected hyperparameters for pretraining each dataset in Table 8. Following [85], we pretrain MetaMAE for 100K iterations using the AdamW optimizer [56] with both the learning rate and weight decay set to 1e-4. We set the temperature for the contrastive loss to τ = 0.5 and the Nearby-S ratio to r = 0.1. In line with [85], we freeze the pretrained model and train either a linear classifier or a regressor for 100 epochs during the linear-evaluation phase, using the Adam optimizer [42] with both the learning rate and weight decay set to 1e-4. These settings appear in code form in the second sketch after this table.
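
To make the Pseudocode row concrete, below is a minimal, self-contained sketch of a meta-learned masked auto-encoding step in PyTorch. It is our own illustration rather than the paper's Algorithm 1: the encoder/decoder interfaces, the support/query split of the masked tokens, the single inner gradient step on the latent, and the inner_lr and mask_ratio values are all assumptions made for exposition; the actual procedure (including the task contrastive loss with τ = 0.5) should be taken from Algorithm 1 and the released code.

```python
# Illustrative sketch only (NOT the paper's Algorithm 1): one masked auto-encoding
# step in which the visible-token latent is adapted by a single inner gradient step
# on a "support" subset of masked tokens before reconstructing the "query" subset.
# `encoder`, `decoder`, the support/query split, and `inner_lr` are hypothetical.
import torch
import torch.nn.functional as F


def gather_tokens(x, idx):
    # Select tokens at positions `idx` from a (B, N, D) tensor.
    return x.gather(1, idx.unsqueeze(-1).expand(-1, -1, x.size(-1)))


def meta_mae_step(encoder, decoder, tokens, mask_ratio=0.75, inner_lr=0.1):
    """One sketched training step on a batch of (B, N, D) modality-agnostic tokens."""
    B, N, D = tokens.shape
    num_masked = int(N * mask_ratio)
    perm = torch.rand(B, N, device=tokens.device).argsort(dim=1)
    masked_idx, visible_idx = perm[:, :num_masked], perm[:, num_masked:]

    # Encode only the visible tokens, as in a standard masked auto-encoder.
    latent = encoder(gather_tokens(tokens, visible_idx))  # (B, N_vis, D)

    # Split the masked positions into support (inner loss) and query (outer loss).
    support_idx, query_idx = masked_idx.chunk(2, dim=1)
    support_tgt = gather_tokens(tokens, support_idx)
    query_tgt = gather_tokens(tokens, query_idx)

    # Inner step: adapt the latent with one gradient step on support reconstruction.
    inner_loss = F.mse_loss(decoder(latent, support_idx), support_tgt)
    (grad,) = torch.autograd.grad(inner_loss, latent, create_graph=True)
    adapted = latent - inner_lr * grad

    # Outer objective: reconstruct the query tokens from the adapted latent.
    return F.mse_loss(decoder(adapted, query_idx), query_tgt)


if __name__ == "__main__":
    # Toy, shapes-only usage with a linear encoder and a mean-pooling decoder stub.
    D = 16
    enc = torch.nn.Linear(D, D)
    dec = lambda latent, idx: latent.mean(1, keepdim=True).expand(-1, idx.size(1), -1)
    loss = meta_mae_step(enc, dec, torch.randn(4, 32, D))
    loss.backward()
    print(float(loss))
```

The structural point the sketch tries to convey is the bi-level split: an inner reconstruction loss adapts the shared latent, and the outer loss on held-out masked tokens trains the encoder and decoder through that adaptation.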
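
The Dataset Splits and Experiment Setup rows report the schedule and optimizer settings; the sketch below collects them in PyTorch form. Only the numeric values (100K pretraining iterations, AdamW with learning rate and weight decay of 1e-4, 100 linear-evaluation epochs with Adam at the same values) come from the quoted text; backbone, head, loader, and the cross-entropy objective are hypothetical placeholders rather than the paper's code.

```python
# Sketch of the reported schedule and optimizers; `backbone`, `head`, and `loader`
# are hypothetical placeholders, and only the numeric settings come from the paper.
import torch
import torch.nn.functional as F

PRETRAIN_ITERATIONS = 100_000   # 100K pretraining iterations per dataset
LINEAR_EVAL_EPOCHS = 100        # 100 epochs of transfer (linear) training


def build_pretrain_optimizer(model):
    # Pretraining: AdamW with learning rate 1e-4 and weight decay 1e-4.
    return torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=1e-4)


def linear_evaluation(backbone, head, loader, epochs=LINEAR_EVAL_EPOCHS):
    # Linear evaluation: freeze the pretrained backbone and train only the head
    # with Adam (learning rate 1e-4, weight decay 1e-4).
    backbone.eval()
    for p in backbone.parameters():
        p.requires_grad_(False)
    opt = torch.optim.Adam(head.parameters(), lr=1e-4, weight_decay=1e-4)
    for _ in range(epochs):
        for x, y in loader:
            with torch.no_grad():
                feats = backbone(x)
            loss = F.cross_entropy(head(feats), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return head
```

A regression target would swap the cross-entropy loss for an MSE loss, matching the "linear classifier or a regressor" wording in the quoted text.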