Interventional Contrastive Learning with Meta Semantic Regularizer

Authors: Wenwen Qiang, Jiangmeng Li, Changwen Zheng, Bing Su, Hui Xiong

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Empirically, our experiments on multiple benchmark datasets demonstrate that ICL-MSR is able to improve the performances of different state-of-the-art CL methods. ... Table 1 shows the experimental results (linear and 5-nn) of the compared methods with a ResNet-18 feature extractor on small and medium size datasets."
Researcher Affiliation | Academia | "1 Science & Technology on Integrated Information System Laboratory, Institute of Software, Chinese Academy of Sciences, Beijing, China; 2 University of Chinese Academy of Sciences, Beijing, China; 3 Southern Marine Science and Engineering Guangdong Laboratory (Guangzhou), Guangdong, China; 4 Gaoling School of Artificial Intelligence, Renmin University of China, Beijing, China; 5 Beijing Key Laboratory of Big Data Management and Analysis Methods, Beijing, China; 6 Thrust of Artificial Intelligence, The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, China; 7 Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong SAR, China."
Pseudocode | Yes | Appendix: "Interventional Contrastive Learning with Meta Semantic Regularizer", Section A ("The Training Process"), Algorithm 1 (ICL-MSR).
Open Source Code | No | The paper does not provide any explicit statements or links regarding the open-sourcing of the code for the described methodology.
Open Datasets | Yes | "The following datasets are utilized to evaluate the performance of the proposed ICL-MSR: CIFAR-10 and CIFAR-100 (Krizhevsky et al., 2009)... STL-10 (Coates et al., 2011)... Tiny ImageNet (Le & Yang, 2015)... ImageNet-100 (Tian et al., 2020a)... ImageNet (Deng et al., 2009)..." and, for toy experiments, "on the COCO dataset (Lin et al., 2014)."
Dataset Splits | Yes | "Tiny ImageNet (Le & Yang, 2015) can be seen as a simplified version of ImageNet, which contains 100K training samples and 10K testing samples from 200 classes and an image scale of 64 × 64. ... ImageNet (Deng et al., 2009) is a well-known large-scale dataset. It consists of about 1.3M training images and 50K test images with over 1000 classes. ... We follow the semi-supervised protocol of (Chen et al., 2020a; Chuang et al., 2020) and use the same fixed splits of respectively 1% and 10% of the ImageNet labeled training dataset."
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments. It only mentions the neural network architectures (e.g., ResNet-18, ResNet-50) used as feature extractors.
Software Dependencies | No | The paper mentions "the Adam optimizer (Kingma & Ba, 2014)" but does not specify version numbers for Adam or any other software dependencies such as deep learning frameworks (e.g., PyTorch, TensorFlow) or CUDA.
Experiment Setup | Yes | "We set τ = 0.5. For BYOL, we use the exponential moving average with cosine increasing, starting from 0.99. For all the compared methods, the Adam optimizer (Kingma & Ba, 2014) is used for the datasets with small and medium sizes. Also, for CIFAR-10 and CIFAR-100, the number of epochs is set to 1,000 and the learning rate is set to 3 × 10−3; for Tiny ImageNet and ImageNet-100, the number of epochs is set to 1,000 and the learning rate is set to 2 × 10−3; for STL-10, the number of epochs is set to 2,000 and the learning rate is set to 2 × 10−3. Also, for all datasets, we use learning rate warm-up for the first 500 iterations of the optimizer, and a 0.2 learning rate drop 50 and 25 epochs before the end. The dimension of the output of the projection head f_ph is set to 1024. The weight decay is set to 10−6. The output dimension of f is set to 64 for CIFAR-10 and CIFAR-100, and 128 for STL-10 and Tiny ImageNet. Finally, for ImageNet, we set the implementation and hyperparameters to be the same as (Chen et al., 2020b; Chuang et al., 2020)."
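The quoted schedule (500-iteration warm-up, then two 0.2× drops, 50 and 25 epochs before the end) can be sketched as a plain learning-rate function. This is a minimal sketch, not the authors' code: the function name `icl_msr_lr`, the linear warm-up shape, and the assumption that the two drops compound multiplicatively are all assumptions; the paper only states the warm-up length, the drop factor, and the drop points.

```python
def icl_msr_lr(iteration, epoch, base_lr=3e-3, total_epochs=1000,
               warmup_iters=500, drop_factor=0.2):
    """Learning rate at a given global iteration and epoch.

    Defaults reflect the quoted CIFAR-10/CIFAR-100 settings; the warm-up
    shape (linear) and compounding drops are assumptions, not stated.
    """
    if iteration < warmup_iters:
        # Linear warm-up over the first 500 optimizer iterations (shape assumed).
        return base_lr * (iteration + 1) / warmup_iters
    lr = base_lr
    if epoch >= total_epochs - 50:   # first 0.2x drop, 50 epochs before the end
        lr *= drop_factor
    if epoch >= total_epochs - 25:   # second 0.2x drop, 25 epochs before the end
        lr *= drop_factor
    return lr
```

With these defaults, the rate ramps to 3e-3 by iteration 500, holds there, then falls to 6e-4 at epoch 950 and 1.2e-4 at epoch 975. In a PyTorch setup this function could be wrapped in `torch.optim.lr_scheduler.LambdaLR` around the Adam optimizer the paper names.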