Contrastive Learning with Hard Negative Samples
Authors: Joshua David Robinson, Ching-Yao Chuang, Suvrit Sra, Stefanie Jegelka
ICLR 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, our hard negative sampling strategy improves the downstream task performance for image, graph and text data. |
| Researcher Affiliation | Academia | Massachusetts Institute of Technology Cambridge, MA, USA {joshrob,cychuang,suvrit,stefje}@mit.edu |
| Pseudocode | Yes | PyTorch-style pseudocode for the objective is given in Fig. 13 in Appendix D. |
| Open Source Code | Yes | Code available at: https://github.com/joshr17/HCL |
| Open Datasets | Yes | We begin by testing the hard sampling method on vision tasks using the STL10, CIFAR100 and CIFAR10 data. |
| Dataset Splits | Yes | Each embedding is evaluated using the average accuracy of 10-fold cross-validation using an SVM as the classifier |
| Hardware Specification | No | No specific hardware details (like GPU or CPU models, or memory specifications) were provided for the experimental setup. |
| Software Dependencies | No | The paper mentions software like PyTorch and the Adam optimizer, but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | For all experiments β is treated as a hyper-parameter (see ablations in Fig. 2 for more understanding of how to pick β). Values for M and τ⁺ must also be determined. We fix M = 1 for all experiments...all models are trained for 400 epochs. We use the Adam optimizer (Kingma & Ba, 2015) with learning rate 0.001 and weight decay 10⁻⁶. Each model is trained for 200 epochs, with batch size 128, using the Adam optimizer (Kingma & Ba, 2015) with learning rate 0.001 and weight decay of 10⁻⁶. |
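The pseudocode row above points to PyTorch-style pseudocode in Fig. 13 of Appendix D of the paper. For illustration only, below is a minimal, self-contained sketch of a hard-negative-weighted contrastive objective in the spirit of the method, using the hyper-parameters named in the quoted setup (β, τ⁺, and a temperature). This is an assumed reconstruction, not the authors' Fig. 13 code verbatim; the reference implementation is at https://github.com/joshr17/HCL.

```python
import math
import torch


def hard_negative_nce_loss(z1, z2, temperature=0.5, beta=1.0, tau_plus=0.1):
    """Sketch of a hard-negative-weighted InfoNCE-style loss.

    z1, z2: L2-normalised embeddings of two augmented views, shape (B, d).
    beta concentrates weight on high-similarity ("hard") negatives;
    tau_plus debiases for negatives that share the anchor's latent class.
    """
    B = z1.size(0)
    z = torch.cat([z1, z2], dim=0)                        # (2B, d)

    # Exponentiated pairwise similarities between all 2B embeddings.
    sim = torch.exp(torch.mm(z, z.t()) / temperature)     # (2B, 2B)

    # Mask out self-pairs and the positive pair for each anchor.
    self_mask = torch.eye(2 * B, dtype=torch.bool, device=z.device)
    pos_mask = self_mask.roll(shifts=B, dims=0)           # positives sit B rows apart
    neg = sim[~(self_mask | pos_mask)].view(2 * B, -1)    # (2B, 2B - 2) negative scores

    # Positive score for each anchor (same for both views).
    pos = torch.exp((z1 * z2).sum(dim=-1) / temperature)
    pos = torch.cat([pos, pos], dim=0)                    # (2B,)

    # Importance weights proportional to neg**beta emphasise hard negatives;
    # the tau_plus correction subtracts the expected contribution of false negatives.
    N = 2 * B - 2
    imp = (beta * neg.log()).exp()
    reweighted_neg = (imp * neg).sum(dim=-1) / imp.mean(dim=-1)
    Ng = (-tau_plus * N * pos + reweighted_neg) / (1 - tau_plus)
    Ng = torch.clamp(Ng, min=N * math.e ** (-1.0 / temperature))  # numerical floor

    return (-torch.log(pos / (pos + Ng))).mean()
```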
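A hedged usage sketch of the quoted experiment-setup row (Adam, learning rate 0.001, weight decay 10⁻⁶, batch size 128), built on the loss sketch above. The encoder and the random inputs are placeholders introduced here for illustration and are not taken from the paper.

```python
import torch
import torch.nn.functional as F

# Placeholder encoder; the paper uses standard backbones rather than this toy network.
encoder = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 128))
optimizer = torch.optim.Adam(encoder.parameters(), lr=1e-3, weight_decay=1e-6)

# Dummy batch of 128 images in two augmented views (random data, for illustration only).
x1 = torch.randn(128, 3, 32, 32)
x2 = torch.randn(128, 3, 32, 32)

z1 = F.normalize(encoder(x1), dim=-1)
z2 = F.normalize(encoder(x2), dim=-1)
loss = hard_negative_nce_loss(z1, z2, temperature=0.5, beta=1.0, tau_plus=0.1)

optimizer.zero_grad()
loss.backward()
optimizer.step()
```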