EMC$^2$: Efficient MCMC Negative Sampling for Contrastive Learning with Global Convergence
Authors: Chung-Yiu Yau, Hoi To Wai, Parameswaran Raman, Soumajyoti Sarkar, Mingyi Hong
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Numerical experiments validate that EMC2 is effective with small batch training and achieves comparable or better performance than baseline algorithms. We report the results for pre-training image encoders on STL-10 and Imagenet-100. |
| Researcher Affiliation | Collaboration | 1Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Hong Kong SAR of China. The work of C.-Y. Yau was done while interning at Amazon Web Services. 2Amazon Web Services, USA. 3Department of Electrical and Computer Engineering, University of Minnesota, USA. M. Hong holds concurrent appointments as an Amazon Scholar and as a faculty at University of Minnesota. This paper describes his work performed at Amazon. Correspondence to: Chung-Yiu Yau <cyyau@se.cuhk.edu.hk>. |
| Pseudocode | Yes | Algorithm 1 Efficient MCMC Negative Sampling Method for Contrastive Learning (EMC2) |
| Open Source Code | Yes | The code used in the experiments is available at https://github.com/amazon-science/contrastive_emc2. |
| Open Datasets | Yes | We concentrate on two common datasets under this setup: STL-10 and Imagenet-100. Table 2 (dataset attributes) includes STL-10 and Imagenet-100. |
| Dataset Splits | No | The paper discusses metrics like 'linear probe (LP) accuracy' and '1-nearest-neighbor (1-NN) accuracy' and mentions 'test accuracy' in figures, but does not explicitly provide details on train/validation/test dataset splits (e.g., percentages or sample counts) for reproducibility. |
| Hardware Specification | Yes | Note that in this setup, the negative cache algorithm uses four Tesla T4 GPUs for training and refreshing the negative cache, while the other algorithms run on one Tesla T4 GPU. |
| Software Dependencies | No | The paper mentions using 'Adam optimizer' and 'LARS optimizer' but does not specify software versions (e.g., 'PyTorch 1.9', 'TensorFlow 2.5'). |
| Experiment Setup | Yes | In Table 3, we list the hyperparameter values adopted in our experiments. Table 3 columns: Dataset, Model, Inverse Temp. β, Batch Size b, Learning Rate γ, Feature Dim. d, Weight Decay, Cache Refresh ρ (Negative Cache), Burn-in Steps P (EMC2). |
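
The Pseudocode row above references Algorithm 1 (EMC$^2$), an MCMC-based negative sampler for contrastive learning. As a rough illustration of the core idea only, the sketch below shows one Metropolis-Hastings step targeting a softmax distribution over candidate negatives; the function name, the uniform proposal, and the toy dimensions are our own assumptions and not the paper's exact procedure (see Algorithm 1 and Table 3 in the paper for the actual algorithm and hyperparameters).

```python
import numpy as np

def mh_negative_sampling_step(state_idx, anchor_emb, candidate_embs, beta, rng):
    """One Metropolis-Hastings step toward p(j) ∝ exp(beta * <anchor, candidate_j>)
    over candidate negatives (illustrative sketch, not the paper's Algorithm 1).

    state_idx      : index of the current negative sample (chain state)
    anchor_emb     : embedding of the anchor example, shape (d,)
    candidate_embs : embeddings of candidate negatives, shape (n, d)
    beta           : inverse temperature (cf. β in Table 3)
    rng            : numpy random Generator
    """
    n = candidate_embs.shape[0]
    # Uniform proposal over candidates (an assumed, symmetric proposal).
    proposal_idx = rng.integers(n)
    # MH acceptance ratio for the softmax target with a symmetric proposal.
    log_ratio = beta * (candidate_embs[proposal_idx] @ anchor_emb
                        - candidate_embs[state_idx] @ anchor_emb)
    if np.log(rng.random()) < min(0.0, log_ratio):
        return proposal_idx  # accept the proposed negative
    return state_idx         # reject and keep the current state


# Toy usage: run a short chain with a few burn-in steps.
rng = np.random.default_rng(0)
d, n = 16, 100
anchor = rng.standard_normal(d)
candidates = rng.standard_normal((n, d))
state = rng.integers(n)
for _ in range(10):  # burn-in steps (cf. the hyperparameter P in Table 3)
    state = mh_negative_sampling_step(state, anchor, candidates, beta=1.0, rng=rng)
print("sampled negative index:", state)
```

In the paper's setting, the candidates would be encoder embeddings of other training examples and the chain state is carried across training iterations, with P burn-in steps (Table 3) before samples are used in the contrastive gradient; the sketch above only conveys the shape of a single sampling step.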