Gradient Regularized Contrastive Learning for Continual Domain Adaptation

Authors: Shixiang Tang, Peng Su, Dapeng Chen, Wanli Ouyang

AAAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on Digits, DomainNet and Office-Caltech benchmarks demonstrate the strong performance of our approach when compared to the state-of-the-art.
Researcher Affiliation | Collaboration | 1) University of Sydney, Australia; 2) The Chinese University of Hong Kong, Hong Kong; 3) SenseTime Group Limited, Hong Kong
Pseudocode | No | The paper states 'The training protocol of GRCL is summarized in Supplementary Materials,' but no pseudocode or algorithm block is present in the provided paper text.
Open Source Code | No | The paper does not provide an explicit statement or link for open-source code for the methodology.
Open Datasets | Yes | Digits includes five digits datasets (MNIST (LeCun et al. 1998), MNIST-M (Ganin and Lempitsky 2015), USPS (Hull 1994), Syn Num (Ganin and Lempitsky 2015) and SVHN (Netzer et al. 2011)). DomainNet (Peng et al. 2019a) is one of the largest domain adaptation datasets... Office-Caltech (Gong et al. 2012) includes 10 categories shared by Office-31 (Saenko et al. 2010) and Caltech256 (Griffin, Holub, and Perona 2007) datasets.
Dataset Splits | No | Each domain has 7,500 images for training and 1,500 images for testing. Each domain randomly selects 40,000 images for training and 8,000 images for testing. The paper provides specific training and testing splits but does not explicitly describe a validation split.
Hardware Specification | No | The paper does not provide any specific hardware details such as GPU or CPU models, memory, or specific computing environments used for running experiments.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., programming languages, libraries, or frameworks with their versions).
Experiment Setup | Yes | For contrastive learning, we set batch size to be 256, feature update momentum to be m = 0.5 in Eq. 3, number of negatives to be 1024 and training schedule to be 240 epochs. The MLP head uses a hidden dimension of 2048. Following (Wu et al. 2018; He et al. 2020), the temperature τ in Eq. 4 is 0.07. For data augmentation, we use random color jittering, Gaussian blur and random horizontal flip.
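
The experiment-setup row above names the augmentation types (random color jittering, Gaussian blur, random horizontal flip) and a 2048-dim hidden layer in the MLP head. The sketch below is a minimal PyTorch/torchvision illustration of how such a pipeline and head could be built; it is not the authors' released code, and the jitter strengths, blur kernel size, application probabilities, and 128-dim output are assumptions, since the paper only specifies the 2048-dim hidden dimension and the augmentation types.

```python
# Minimal sketch of the augmentation pipeline and MLP projection head.
# Only the 2048-dim hidden layer and the augmentation types come from the paper;
# all numeric augmentation parameters below are assumed for illustration.
import torch.nn as nn
import torch.nn.functional as F
from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomApply([transforms.ColorJitter(0.4, 0.4, 0.4, 0.1)], p=0.8),  # random color jittering
    transforms.RandomApply([transforms.GaussianBlur(kernel_size=5)], p=0.5),      # Gaussian blur
    transforms.RandomHorizontalFlip(),                                            # random horizontal flip
    transforms.ToTensor(),
])

class ProjectionHead(nn.Module):
    """MLP head with a 2048-dim hidden layer, as stated in the experiment setup."""
    def __init__(self, in_dim, hidden_dim=2048, out_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden_dim),
            nn.ReLU(inplace=True),
            nn.Linear(hidden_dim, out_dim),
        )

    def forward(self, x):
        # L2-normalize so dot products act as cosine similarities in the contrastive loss.
        return F.normalize(self.net(x), dim=1)
```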
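
The same row states temperature τ = 0.07 (Eq. 4), 1024 negatives, and a feature-update momentum m = 0.5 (Eq. 3). The following hedged sketch shows one common way to wire these values into an InfoNCE loss with a memory bank of stored features; the memory-bank formulation and the helper names `info_nce_loss` and `update_memory` are assumptions, and the paper's own equations may differ in detail.

```python
# Hedged sketch of the contrastive objective with the stated hyperparameters:
# temperature tau = 0.07, 1024 negatives, memory momentum m = 0.5.
import torch
import torch.nn.functional as F

def info_nce_loss(q, k_pos, negatives, tau=0.07):
    """InfoNCE loss.

    q:         (B, D) L2-normalized query features
    k_pos:     (B, D) L2-normalized positive key features
    negatives: (K, D) L2-normalized memory features, K = 1024
    """
    l_pos = torch.einsum("bd,bd->b", q, k_pos).unsqueeze(1)   # (B, 1) positive logits
    l_neg = torch.einsum("bd,kd->bk", q, negatives)           # (B, K) negative logits
    logits = torch.cat([l_pos, l_neg], dim=1) / tau
    labels = torch.zeros(q.size(0), dtype=torch.long, device=q.device)  # positive sits at index 0
    return F.cross_entropy(logits, labels)

@torch.no_grad()
def update_memory(memory, keys, indices, m=0.5):
    """Momentum update of stored features: v <- normalize(m * v + (1 - m) * v_new)."""
    memory[indices] = F.normalize(m * memory[indices] + (1.0 - m) * keys, dim=1)
```

In this sketch, `memory` would be a (num_samples, D) tensor of running features from which 1024 negatives are drawn each step, while the batch size of 256 and the 240-epoch schedule quoted above would be set in the surrounding training loop.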