Conditional Contrastive Learning with Kernel
Authors: Yao-Hung Hubert Tsai, Tianqin Li, Martin Q. Ma, Han Zhao, Kun Zhang, Louis-Philippe Morency, Ruslan Salakhutdinov
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments using weakly supervised, fair, and hard-negative contrastive learning, showing CCL-K outperforms state-of-the-art baselines. We conduct experiments on the conditional contrastive learning frameworks discussed in Section 2.2: Section 4.1 for weakly supervised contrastive learning, Section 4.2 for fair contrastive learning, and Section 4.3 for hard-negative contrastive learning. |
| Researcher Affiliation | Academia | ¹Carnegie Mellon University, ²University of Illinois at Urbana-Champaign, ³Mohamed bin Zayed University of Artificial Intelligence. {yaohungt, tianqinl, qianlim, kunz1, morency, rsalakhu}@cs.cmu.edu, {hanzhao}@illinois.edu |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. It describes the methods in text and mathematical formulations. |
| Open Source Code | Yes | Code available at: https://github.com/Crazy-Jack/CCLK-release. |
| Open Datasets | Yes | 1) UT-Zappos (Yu and Grauman, 2014): it contains 50,025 shoe images over 21 shoe categories. 2) CUB (Wah et al., 2011): it contains 11,788 bird images spanning 200 fine-grained bird species, with 312 binary attributes attached to each image. 3) ImageNet-100 (Russakovsky et al., 2015): a subset of the ImageNet-1k dataset, containing 0.12 million images spanning 100 categories. 4) CIFAR-10 (Krizhevsky et al., 2009): it contains 60,000 images spanning 10 classes, e.g., automobile, plane, or dog. 5) ColorMNIST: we synthetically create the ColorMNIST dataset, which randomly assigns a continuous RGB color value to the background of each handwritten digit image in the MNIST dataset (LeCun et al., 1998); a sketch of this colorization procedure appears after the table. UT-Zappos is attributed to Yu and Grauman (2014) and is available at http://vision.cs.utexas.edu/projects/finegrained/utzap50k. CUB-200-2011 was created by Wah et al. (2011) and is a fine-grained dataset of bird species; it can be downloaded from http://www.vision.caltech.edu/visipedia/CUB-200-2011.html. CIFAR-10 (Krizhevsky et al., 2009) is an object classification dataset with 60,000 32×32 images in 10 classes; it can be downloaded at https://www.cs.toronto.edu/~kriz/cifar.html. ImageNet-100 is a subset of the ImageNet-1K dataset, which comes from the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) 2012-2017 (Russakovsky et al., 2015); ILSVRC is for non-commercial research and educational purposes, and we refer to the ImageNet official site for more information: https://www.image-net.org/download.php. |
| Dataset Splits | Yes | We randomly split train/validation images at a 7:3 ratio, resulting in 35,017 training images and 15,008 validation images (a sketch of this random split appears after the table). We follow the original train-validation split, resulting in 5,994 training images and 5,794 validation images. We combine the original training and validation sets as our training set and use the original test set as our validation set; the resulting training set contains 6,871 images and the validation set contains 6,918 images. We use the training and test split from the original dataset. We follow the original MNIST train/test split, resulting in 60,000 training images and 10,000 testing images spanning 10 digit categories. The training split contains 128,783 images and the test split contains 5,000 images. |
| Hardware Specification | Yes | It takes a machine with 4 NVIDIA 1080 Ti GPUs 8 hours to finish the pretraining. For the second setting, where we train with batch size 512 for 1000 epochs, it takes a DGX-1 machine 48 hours to finish training. We use batch size 128 and train on 4 NVIDIA 1080 Ti GPUs. All experiments are trained for 200 epochs and require 53 hours of training on a DGX machine with 8 Tesla P100 GPUs. |
| Software Dependencies | No | The paper mentions using the LARS optimizer, Limited-memory BFGS (L-BFGS), and the OpenAI CLIP model, but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | In the pre-training stage, on the dataset's training split, we update the parameters of the feature encoder (i.e., g_θ(·) in Equation 2) using the contrastive learning objectives, e.g., InfoNCE (Equation 1), WeaklySup-InfoNCE (Equation 3), or WeaklySup-CCLK (Equation 7). We train 1000 epochs for all experiments with the LARS optimizer (base learning rate 1.5, scaled by batch size divided by 256) with batch size 152 on 4 NVIDIA 1080 Ti GPUs. All experiments are run with 1000 pretraining iterations and 500 L-BFGS fine-tuning steps. We use batch size 128. The first setting, reported in the main text, trains contrastive learning with batch size 256 for 400 epochs; in the second setting, we train with batch size 512 for 1000 epochs. We use the LARS optimizer for all CCL-K related experiments with base lr = 1.5 and base batch size 256. All experiments are trained for 200 epochs. (Sketches of the InfoNCE objective and the learning-rate scaling rule appear after the table.) |
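
The ColorMNIST construction quoted in the Open Datasets row (a random continuous RGB background color per MNIST digit) can be reproduced in a few lines. The sketch below is our illustration of the described procedure, not the authors' implementation; the function name `colorize_background`, the blending rule, and the seed are assumptions.

```python
# Hypothetical sketch of the ColorMNIST construction: each MNIST digit keeps
# its grayscale foreground while the background is filled with a randomly
# sampled continuous RGB color. Names and the blending rule are ours.
import numpy as np
from torchvision.datasets import MNIST

def colorize_background(digit: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """digit: (28, 28) uint8 grayscale image -> (28, 28, 3) uint8 RGB image."""
    color = rng.uniform(0.0, 1.0, size=3)                 # continuous RGB background color
    mask = (digit.astype(np.float32) / 255.0)[..., None]  # foreground intensity in [0, 1]
    background = np.ones((28, 28, 3), dtype=np.float32) * color
    blended = mask + (1.0 - mask) * background            # white digit over colored background
    return (np.clip(blended, 0.0, 1.0) * 255).astype(np.uint8)

rng = np.random.default_rng(0)
mnist = MNIST(root="./data", train=True, download=True)
colored = [colorize_background(np.array(img), rng) for img, _ in mnist]
```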
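
The 7:3 UT-Zappos train/validation split from the Dataset Splits row can be mirrored as below. The placeholder dataset and the fixed seed are assumptions for illustration; only the split sizes come from the paper.

```python
# Hypothetical 7:3 random train/validation split mirroring the UT-Zappos
# numbers quoted above (35,017 train / 15,008 validation out of 50,025).
import torch
from torch.utils.data import TensorDataset, random_split

dataset = TensorDataset(torch.zeros(50_025, 1))  # stand-in for 50,025 images
n_train = 35_017
n_val = len(dataset) - n_train                   # 15,008
train_set, val_set = random_split(
    dataset, [n_train, n_val], generator=torch.Generator().manual_seed(0))
print(len(train_set), len(val_set))              # 35017 15008
```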
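
The Experiment Setup row references the InfoNCE objective (Equation 1) and a linear learning-rate scaling rule for LARS. Below is a minimal generic sketch of both, assuming Equation 1 is the standard InfoNCE loss; variable names and the temperature value are our assumptions, and this is a reference implementation rather than the authors' code.

```python
# Generic InfoNCE loss plus the linear LR scaling rule quoted above
# ("base learning rate 1.5, scaled by batch size divided by 256").
# Variable names and the temperature are illustrative assumptions.
import torch
import torch.nn.functional as F

def info_nce(z_anchor: torch.Tensor, z_positive: torch.Tensor,
             temperature: float = 0.1) -> torch.Tensor:
    """z_anchor, z_positive: (batch, dim) embeddings of two augmented views;
    every other sample in the batch acts as a negative."""
    z_anchor = F.normalize(z_anchor, dim=1)
    z_positive = F.normalize(z_positive, dim=1)
    logits = z_anchor @ z_positive.t() / temperature               # (batch, batch) similarities
    labels = torch.arange(z_anchor.size(0), device=logits.device)  # positives on the diagonal
    return F.cross_entropy(logits, labels)

# Linear scaling of the LARS base learning rate with batch size:
base_lr, base_batch_size, batch_size = 1.5, 256, 152
lr = base_lr * batch_size / base_batch_size                        # ≈ 0.89 for batch size 152
```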