Towards Domain-Agnostic Contrastive Learning

Authors: Vikas Verma, Thang Luong, Kenji Kawaguchi, Hieu Pham, Quoc Le

ICML 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "To demonstrate the effectiveness of DACL, we conduct experiments across various domains such as tabular data, images, and graphs. Our results show that DACL not only outperforms other domain-agnostic noising methods, such as Gaussian-noise, but also combines well with domain-specific methods, such as SimCLR, to improve self-supervised visual representation learning."
Researcher Affiliation | Collaboration | "1 Google Research, Brain Team. 2 Aalto University, Finland. 3 Harvard University. Correspondence to: Vikas Verma <vikas.verma@aalto.fi>, Minh-Thang Luong <thangluong@google.com>, Kenji Kawaguchi <kkawaguchi@fas.harvard.edu>, Hieu Pham <hyhieu@google.com>, Quoc V. Le <qvl@google.com>."
Pseudocode | Yes | "Algorithm 1 Mixup-noise Domain-Agnostic Contrastive Learning." (a hedged code sketch of this algorithm follows the table)
Open Source Code | No | The paper does not contain any explicit statements about making the source code available, nor does it provide any links to a code repository.
Open Datasets | Yes | "For tabular data experiments, we use Fashion-MNIST and CIFAR-10 datasets..." "We use three benchmark image datasets: CIFAR-10, CIFAR-100, and ImageNet." "We present the results of applying DACL to graph classification problems using six well-known benchmark datasets: MUTAG, PTC-MR, REDDIT-BINARY, REDDIT-MULTI5K, IMDB-BINARY, and IMDB-MULTI (Simonovsky & Komodakis, 2017; Yanardag & Vishwanathan, 2015)."
Dataset Splits | Yes | "For all experiments, for pretraining, we train the model for 20 epochs with a batch size of 128, and for linear evaluation, we train the linear classifier on the learned representations for 100 updates with full-batch training." ... "We perform linear evaluation using 10-fold cross-validation."
Hardware Specification | No | The paper describes experimental settings such as batch sizes, epochs, and optimizers, but does not provide specific details on the hardware used (e.g., GPU models, CPU types, or memory).
Software Dependencies | No | The paper mentions the optimizers used (LARS, Adam) and refers to the "ResNet-50 (x4)" and "GIN" architectures, but it does not specify software versions for libraries such as TensorFlow or PyTorch, or for the Python interpreter itself.
Experiment Setup | Yes | "For experiments on tabular and image datasets (Section 5.1 and 5.2), we search the hyperparameter α for linear mixing (Section 3 or line 5 in Algorithm 1) from the set {0.5, 0.6, 0.7, 0.8, 0.9}. ... For all experiments, the hyperparameter temperature τ (line 20 in Algorithm 1) is searched from the set {0.1, 0.5, 1.0}." "All pretraining methods are trained for 1000 epochs with a batch size of 4096. The linear classifier is trained for 200 epochs with a batch size of 256. We use LARS optimizer... The initial learning rate for both pre-training and linear classifier is set to 0.1." (see the search sketch after the table)
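The Pseudocode row refers to Algorithm 1, Mixup-noise Domain-Agnostic Contrastive Learning. The snippet below is a minimal sketch of that idea, assuming a SimCLR-style NT-Xent objective in which each sample's two views are produced by mixing it with randomly permuted samples from the same batch and the mixing coefficient is drawn from Uniform(α, 1). The names mixup_views, nt_xent_loss, encoder, and projection_head are illustrative assumptions, not the authors' released code.

import torch
import torch.nn.functional as F


def mixup_views(x, alpha=0.9):
    """Create two 'mixup-noise' views of a batch by mixing each sample
    with a randomly permuted sample, using a dominant mixing coefficient."""
    views = []
    for _ in range(2):
        lam = torch.empty(x.size(0), device=x.device).uniform_(alpha, 1.0)
        lam = lam.view(-1, *([1] * (x.dim() - 1)))      # broadcast over feature dims
        perm = torch.randperm(x.size(0), device=x.device)
        views.append(lam * x + (1.0 - lam) * x[perm])
    return views


def nt_xent_loss(z1, z2, tau=0.5):
    """Normalized temperature-scaled cross-entropy loss (as in SimCLR),
    treating (z1[i], z2[i]) as the positive pair for each sample i."""
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)   # 2N x d, unit-norm rows
    sim = z @ z.t() / tau                                # cosine similarities / temperature
    n = z1.size(0)
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim.masked_fill_(mask, float('-inf'))                # exclude self-similarity
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)


def dacl_step(x, encoder, projection_head, alpha=0.9, tau=0.5):
    """One pretraining step: build two mixup-noise views and contrast them."""
    v1, v2 = mixup_views(x, alpha)
    z1 = projection_head(encoder(v1))
    z2 = projection_head(encoder(v2))
    return nt_xent_loss(z1, z2, tau)

Because the mixing coefficient is sampled close to 1, each mixed view remains dominated by its original sample, which is what makes the two views of the same input a sensible positive pair under this assumed formulation.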
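The Experiment Setup and Dataset Splits rows quote the search grids for α and τ and the 10-fold cross-validation linear-evaluation protocol used for graphs. The following is a hedged sketch of how such a sweep could be wired up; pretrain_dacl and extract_features are hypothetical placeholders, and scikit-learn's logistic regression stands in for whatever linear classifier the authors actually trained.

from itertools import product

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

ALPHAS = [0.5, 0.6, 0.7, 0.8, 0.9]   # linear-mixing coefficient grid (from the paper's text)
TAUS = [0.1, 0.5, 1.0]               # softmax temperature grid (from the paper's text)


def linear_eval_10fold(features, labels):
    """Linear evaluation of frozen representations with 10-fold cross-validation."""
    clf = LogisticRegression(max_iter=1000)
    return cross_val_score(clf, features, labels, cv=10).mean()


def search(dataset, pretrain_dacl, extract_features):
    """Grid-search (alpha, tau): pretrain once per setting, then score the
    frozen representations with a linear classifier."""
    results = {}
    for alpha, tau in product(ALPHAS, TAUS):
        encoder = pretrain_dacl(dataset, alpha=alpha, tau=tau)
        feats, labels = extract_features(encoder, dataset)
        results[(alpha, tau)] = linear_eval_10fold(np.asarray(feats), np.asarray(labels))
    best = max(results, key=results.get)
    return best, results

This reflects only the protocol quoted above (grid values, linear probing, 10-fold cross-validation); the pretraining schedules, LARS optimizer, and large-batch ImageNet settings from the Experiment Setup row are not reproduced here.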