Learning Representations by Maximizing Mutual Information Across Views

Authors: Philip Bachman, R Devon Hjelm, William Buchwalter

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our model using standard datasets: CIFAR10, CIFAR100, STL10 [Coates et al., 2011], ImageNet [Russakovsky et al., 2015], and Places205 [Zhou et al., 2014]. We evaluate performance following the protocol described by Kolesnikov et al. [2019]. Our model outperforms prior work on these datasets.
Researcher Affiliation | Collaboration | Philip Bachman (Microsoft Research, phil.bachman@gmail.com); R Devon Hjelm (Microsoft Research and MILA, devon.hjelm@microsoft.com); William Buchwalter (Microsoft Research, wibuch@microsoft.com)
Pseudocode | Yes | Figure 1 (c)-top: an algorithm for efficient NCE with minibatches of n_a images, comprising one antecedent and n_c consequents per image. For each true (antecedent, consequent) positive sample pair, we compute the NCE bound using all consequents associated with all other antecedents as negative samples. Our pseudo-code is roughly based on PyTorch. (A minimal sketch of this NCE computation is given below the table.)
Open Source Code | Yes | Our code is available online: https://github.com/Philip-Bachman/amdim-public.
Open Datasets | Yes | We evaluate our model using standard datasets: CIFAR10, CIFAR100, STL10 [Coates et al., 2011], ImageNet [Russakovsky et al., 2015], and Places205 [Zhou et al., 2014].
Dataset Splits | No | The paper refers to using 'the training set' and evaluating with linear and MLP classifiers, and states that it follows the evaluation protocol described by Kolesnikov et al. [2019]. However, it does not explicitly provide specifics such as exact percentages or sample counts for training, validation, or test splits within its own text. (A sketch of the linear evaluation protocol is given below the table.)
Hardware Specification | Yes | We train our models using 4-8 standard Tesla V100 GPUs per model.
Software Dependencies | No | The paper mentions 'pytorch' in the pseudocode description but does not provide specific version numbers for it or any other software dependencies.
Experiment Setup | Yes | We use NCE regularization weight λ = 4e-2 for all experiments. [...] We use c = 20 for all experiments. [...] We trained AMDIM models for 150 epochs on 8 NVIDIA Tesla V100 GPUs. [...] On ImageNet, using a model with size parameters (ndf=320, nrkhs=2536, ndepth=10) and a batch size of 1008... (A sketch of the score clipping and regularization is given below the table.)
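
The NCE computation quoted in the Pseudocode row can be summarized with a short sketch. This is a minimal PyTorch illustration, not the authors' released code: tensor names and shapes are assumptions, and for simplicity the softmax denominator also includes the other consequents of the same antecedent, which may differ slightly from the paper's exact construction.

```python
import torch
import torch.nn.functional as F

def nce_loss(antecedents, consequents):
    """antecedents: (n_a, d); consequents: (n_a, n_c, d), row i drawn from the same image as antecedent i."""
    n_a, n_c, _ = consequents.shape
    # Dot-product scores between every antecedent and every consequent in the minibatch.
    scores = torch.einsum('ad,bcd->abc', antecedents, consequents)   # (n_a, n_a, n_c)
    # The softmax for antecedent i runs over all n_a * n_c consequents in the batch,
    # so consequents belonging to other antecedents act as the negative samples.
    log_probs = F.log_softmax(scores.reshape(n_a, n_a * n_c), dim=1)
    log_probs = log_probs.reshape(n_a, n_a, n_c)
    # Positive pairs: antecedent i with its own consequents, i.e. entries [i, i, :].
    idx = torch.arange(n_a)
    positives = log_probs[idx, idx, :]                               # (n_a, n_c)
    return -positives.mean()
```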
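
The evaluation protocol referenced in the Dataset Splits row (linear classifiers trained on frozen features, following Kolesnikov et al. [2019]) roughly corresponds to the sketch below. The `encoder`, data loader, feature dimension, and hyperparameters are placeholders rather than the paper's exact configuration.

```python
import torch
import torch.nn as nn

def linear_eval(encoder, train_loader, num_classes, feat_dim, epochs=10, lr=1e-3, device='cuda'):
    encoder.eval()                                   # keep the pretrained encoder frozen
    clf = nn.Linear(feat_dim, num_classes).to(device)
    opt = torch.optim.Adam(clf.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for images, labels in train_loader:
            images, labels = images.to(device), labels.to(device)
            with torch.no_grad():                    # no gradients flow into the encoder
                feats = encoder(images)
            loss = loss_fn(clf(feats), labels)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return clf
```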
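
The Experiment Setup row quotes an NCE regularization weight λ = 4e-2 and a clipping constant c = 20. Below is a hedged sketch of one plausible form of this regularization (tanh-based soft clipping plus a squared-score penalty); the exact functional form is an assumption, not a verbatim reproduction of the paper.

```python
import torch

LAMBDA = 4e-2   # NCE regularization weight reported in the paper
CLIP_C = 20.0   # clipping constant reported in the paper

def regularized_scores(raw_scores):
    # Softly squash raw critic scores into roughly (-c, c); assumed form, not confirmed by the paper.
    clipped = CLIP_C * torch.tanh(raw_scores / CLIP_C)
    # Penalize large raw scores to keep the NCE objective numerically stable.
    penalty = LAMBDA * (raw_scores ** 2).mean()
    return clipped, penalty
```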