Contrastive Learning with Hard Negative Samples

Authors: Joshua David Robinson, Ching-Yao Chuang, Suvrit Sra, Stefanie Jegelka

ICLR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirically, our hard negative sampling strategy improves the downstream task performance for image, graph and text data.
Researcher Affiliation | Academia | Massachusetts Institute of Technology, Cambridge, MA, USA; {joshrob,cychuang,suvrit,stefje}@mit.edu
Pseudocode | Yes | PyTorch-style pseudocode for the objective is given in Fig. 13 in Appendix D (a sketch of this objective appears after the table).
Open Source Code | Yes | Code available at: https://github.com/joshr17/HCL
Open Datasets | Yes | We begin by testing the hard sampling method on vision tasks using the STL10, CIFAR100 and CIFAR10 data.
Dataset Splits | Yes | Each embedding is evaluated using the average accuracy of 10-fold cross-validation with an SVM as the classifier.
Hardware Specification | No | No specific hardware details (such as GPU or CPU models, or memory specifications) were provided for the experimental setup.
Software Dependencies | No | The paper mentions software such as PyTorch and the Adam optimizer, but does not provide specific version numbers for these or other software dependencies.
Experiment Setup | Yes | For all experiments β is treated as a hyper-parameter (see ablations in Fig. 2 for more understanding of how to pick β). Values for M and τ+ must also be determined. We fix M = 1 for all experiments... all models are trained for 400 epochs. We use the Adam optimizer (Kingma & Ba, 2015) with learning rate 0.001 and weight decay 10^-6. Each model is trained for 200 epochs with batch size 128 using the Adam optimizer (Kingma & Ba, 2015), with learning rate 0.001 and weight decay of 10^-6 (see the configuration sketch after the table).
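The Fig. 13 pseudocode is not reproduced on this page. Below is a minimal PyTorch sketch of the debiased hard-negative contrastive objective, written from the description in the paper and the released HCL code; the function name `hard_negative_loss` and the default hyper-parameter values are illustrative, not taken verbatim from the paper.

```python
import math

import torch


def hard_negative_loss(out_1, out_2, beta=1.0, tau_plus=0.1, temperature=0.5):
    """Sketch of the hard-negative contrastive objective.

    out_1, out_2: L2-normalized embeddings of two augmented views, shape (B, D).
    beta: concentration parameter controlling how strongly hard negatives are upweighted.
    tau_plus: class-prior probability used for debiasing.
    """
    batch_size = out_1.size(0)
    out = torch.cat([out_1, out_2], dim=0)  # (2B, D)

    # Exponentiated pairwise similarities between all 2B embeddings.
    sim = torch.exp(torch.mm(out, out.t()) / temperature)  # (2B, 2B)

    # Mask out self-similarities and each anchor's positive pair,
    # leaving 2B - 2 negatives per row.
    mask = ~torch.eye(2 * batch_size, dtype=torch.bool, device=out.device)
    idx = torch.arange(batch_size, device=out.device)
    mask[idx, idx + batch_size] = False
    mask[idx + batch_size, idx] = False
    neg = sim.masked_select(mask).view(2 * batch_size, -1)  # (2B, 2B-2)

    # Positive-pair similarities, duplicated for both views.
    pos = torch.exp(torch.sum(out_1 * out_2, dim=-1) / temperature)
    pos = torch.cat([pos, pos], dim=0)  # (2B,)

    # Importance weights proportional to exp(beta * sim / temperature):
    # more similar (harder) negatives receive larger weight.
    n_neg = 2 * batch_size - 2
    imp = (beta * neg.log()).exp()
    reweighted_neg = (imp * neg).sum(dim=-1) / imp.mean(dim=-1)

    # Debiased negative term, clamped at its theoretical minimum.
    ng = (-tau_plus * n_neg * pos + reweighted_neg) / (1 - tau_plus)
    ng = torch.clamp(ng, min=n_neg * math.e ** (-1.0 / temperature))

    return (-torch.log(pos / (pos + ng))).mean()
```

With beta = 0 the importance weights are uniform and the loss reduces to the debiased contrastive objective; larger beta concentrates the negative term on the most similar, and hence hardest, negatives.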
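To make the quoted Experiment Setup concrete, here is a hedged configuration sketch. The tiny `encoder` module and the beta, tau_plus and temperature values are illustrative placeholders rather than the paper's tuned per-dataset choices; the optimizer settings, M = 1, batch size and epoch counts come from the quoted text.

```python
import torch
from torch import nn, optim

# Stand-in encoder: the paper's vision experiments use a SimCLR-style ResNet
# backbone with a projection head; this tiny module is only a placeholder.
encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128))

# Optimizer settings quoted in the Experiment Setup row:
# Adam with learning rate 0.001 and weight decay 10^-6.
optimizer = optim.Adam(encoder.parameters(), lr=1e-3, weight_decay=1e-6)

# Hyper-parameters mentioned in the quoted setup; beta, tau_plus and temperature
# below are illustrative defaults, not the paper's per-dataset values.
config = {
    "M": 1,              # number of positive samples, fixed to 1 in all experiments
    "beta": 1.0,         # hardness concentration, treated as a hyper-parameter
    "tau_plus": 0.1,     # class-prior estimate used for debiasing
    "temperature": 0.5,  # softmax temperature (assumed)
    "batch_size": 128,
    "epochs": 200,       # 400 epochs in the other quoted configuration
}
```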