Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Learning Representations by Maximizing Mutual Information Across Views

Authors: Philip Bachman, R Devon Hjelm, William Buchwalter

NeurIPS 2019

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "We evaluate our model using standard datasets: CIFAR10, CIFAR100, STL10 [Coates et al., 2011], ImageNet [Russakovsky et al., 2015], and Places205 [Zhou et al., 2014]. We evaluate performance following the protocol described by Kolesnikov et al. [2019]. Our model outperforms prior work on these datasets." |
| Researcher Affiliation | Collaboration | Philip Bachman (Microsoft Research), R Devon Hjelm (Microsoft Research, MILA), William Buchwalter (Microsoft Research) |
| Pseudocode | Yes | "Figure 1: (c)-top: An algorithm for efficient NCE with minibatches of na images, comprising one antecedent and nc consequents per image. For each true (antecedent, consequent) positive sample pair, we compute the NCE bound using all consequents associated with all other antecedents as negative samples. Our pseudo-code is roughly based on pytorch." |
| Open Source Code | Yes | "Our code is available online: https://github.com/Philip-Bachman/amdim-public." |
| Open Datasets | Yes | "We evaluate our model using standard datasets: CIFAR10, CIFAR100, STL10 [Coates et al., 2011], ImageNet [Russakovsky et al., 2015], and Places205 [Zhou et al., 2014]." |
| Dataset Splits | No | The paper refers to using "the training set" and evaluating with linear and MLP classifiers, following the evaluation protocol of Kolesnikov et al. [2019], but it does not give exact percentages or sample counts for training, validation, or test splits in its own text. |
| Hardware Specification | Yes | "We train our models using 4-8 standard Tesla V100 GPUs per model." |
| Software Dependencies | No | The paper mentions "pytorch" in the pseudocode description but does not give version numbers for it or any other software dependency. |
| Experiment Setup | Yes | "We use NCE regularization weight λ = 4e-2 for all experiments. [...] We use c = 20 for all experiments. [...] We trained AMDIM models for 150 epochs on 8 NVIDIA Tesla V100 GPUs. [...] On ImageNet, using a model with size parameters: (ndf=320, nrkhs=2536, ndepth=10), and a batch size of 1008..." |
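The Pseudocode row describes an efficient minibatch NCE scheme: for each true (antecedent, consequent) pair, the consequents belonging to every other antecedent in the batch serve as negative samples. The sketch below illustrates that idea in PyTorch, assuming feature vectors have already been extracted; the function name, tensor shapes, and temperature parameter are illustrative, not taken from the paper.

```python
import torch
import torch.nn.functional as F

def minibatch_nce_loss(antecedents, consequents, temperature=1.0):
    """Illustrative minibatch NCE with in-batch negatives.

    antecedents: (na, d) tensor, one feature vector per image.
    consequents: (na, nc, d) tensor, nc feature vectors per image.
    For each positive (antecedent, consequent) pair, all consequents
    from the other antecedents act as negatives.
    """
    na, nc, d = consequents.shape
    # Score every antecedent against every consequent in the batch.
    flat = consequents.reshape(na * nc, d)          # (na*nc, d)
    scores = antecedents @ flat.t() / temperature   # (na, na*nc)
    # Log-softmax over all na*nc candidates: the positives compete
    # against every other image's consequents as negatives.
    log_probs = F.log_softmax(scores, dim=1)        # (na, na*nc)
    # Mask selecting each antecedent's own nc consequents (positives).
    mask = torch.zeros(na, na * nc, dtype=torch.bool)
    for i in range(na):
        mask[i, i * nc:(i + 1) * nc] = True
    # Maximize the log-probability of each true pair.
    loss = -log_probs[mask].reshape(na, nc).mean()
    return loss
```

This is only a sketch of the batching trick the figure caption describes; the paper's actual objective additionally clips scores (the quoted c = 20) and applies the NCE regularization weight λ.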