Mutual Information Neural Estimation

Authors: Mohamed Ishmael Belghazi, Aristide Baratin, Sai Rajeshwar, Sherjil Ozair, Yoshua Bengio, Aaron Courville, Devon Hjelm

ICML 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Section 4, Empirical comparisons: "Before diving into applications, we perform some simple empirical evaluation and comparisons of MINE. The objective is to show that MINE is effectively able to estimate mutual information and account for non-linear dependence." (The estimated bound is restated below the table.)
Researcher Affiliation | Academia | (1) Montréal Institute for Learning Algorithms (MILA), University of Montréal; (2) Department of Mathematics and Statistics, McGill University; (3) Canadian Institute for Advanced Research (CIFAR); (4) The Institute for Data Valorization (IVADO).
Pseudocode | Yes | Algorithm 1: MINE (see the training-step sketch below the table).
Open Source Code | No | The paper does not contain any explicit statement or link indicating that the source code for the methodology is openly available.
Open Datasets | Yes | Experiment: Stacked MNIST. "Following Che et al. (2016); Metz et al. (2017); Srivastava et al. (2017); Lin et al. (2017), we quantitatively assess MINE's ability to diminish mode dropping on the stacked MNIST dataset, which is constructed by stacking three randomly sampled MNIST digits. We train MINE on datasets of increasing order of complexity: a toy dataset composed of 25 Gaussians, MNIST (LeCun, 1998), and the CelebA dataset (Liu et al., 2015)." (A construction sketch for stacked MNIST appears below the table.)
Dataset Splits | No | The paper mentions using datasets like MNIST and CelebA but does not provide specific train/validation/test split percentages, sample counts, or citations to predefined splits for its experiments. While it mentions a pre-trained classifier on 26,000 samples, this does not detail the authors' own data partitioning.
Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU models, CPU types, memory specifications) used for the experiments.
Software Dependencies | No | The paper does not provide specific software dependency details, such as library names with version numbers, used in the experiments.
Experiment Setup | Yes | "We demonstrate using Eqn. 17 on the spiral and the 25-Gaussians datasets, comparing two models, one with β = 0 (which corresponds to the orthodox GAN as in Goodfellow et al. (2014)) and one with β = 1.0, which corresponds to mutual information maximization. Since the mutual information is theoretically unbounded, we use adaptive gradient clipping (see the Supplementary Material) to ensure that the generator receives learning signals similar in magnitude from the discriminator and the statistics network." (A sketch of the clipped generator update appears below the table.)
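
For context on the empirical-comparisons row above: the quantity MINE estimates is the Donsker-Varadhan lower bound on mutual information, restated here in lightly simplified notation (the statistics network T_θ is optimized over its parameter set Θ):

```latex
I(X;Z) \;\geq\; \sup_{\theta \in \Theta} \; \mathbb{E}_{\mathbb{P}_{XZ}}\!\left[ T_\theta \right] \;-\; \log \mathbb{E}_{\mathbb{P}_X \otimes \mathbb{P}_Z}\!\left[ e^{T_\theta} \right]
```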
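Since Algorithm 1 is cited but not reproduced on this page, below is a minimal PyTorch-style sketch of one MINE update in the spirit of that algorithm. The network architecture, EMA rate, and optimizer handling are assumptions, not the paper's exact settings.

```python
import torch
import torch.nn as nn

class StatisticsNetwork(nn.Module):
    """Small MLP T_theta(x, z) -> scalar score (architecture is illustrative)."""
    def __init__(self, x_dim, z_dim, hidden=100):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(x_dim + z_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x, z):
        return self.net(torch.cat([x, z], dim=1))

def mine_step(T, optimizer, x, z, ema, ema_rate=0.01):
    """One gradient step on the Donsker-Varadhan lower bound.

    x, z: paired minibatch drawn from the joint P_XZ.
    The marginal minibatch is obtained by shuffling z to break the pairing.
    `ema` tracks E[e^T] to reduce the bias of the minibatch gradient.
    """
    z_marginal = z[torch.randperm(z.size(0))]
    t_joint = T(x, z).mean()                           # E_{P_XZ}[T_theta]
    e_t_marginal = torch.exp(T(x, z_marginal)).mean()  # E_{P_X (x) P_Z}[e^{T_theta}]
    mi_lower_bound = t_joint - torch.log(e_t_marginal)

    # Bias-corrected surrogate: dividing by a moving average of E[e^T]
    # rather than the minibatch estimate gives a less biased gradient.
    ema = (1.0 - ema_rate) * ema + ema_rate * e_t_marginal.detach()
    loss = -(t_joint - e_t_marginal / ema)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return mi_lower_bound.item(), ema
```

A training loop would initialize `ema = torch.tensor(1.0)`, build an `Adam` optimizer over `T.parameters()`, and call `mine_step` once per minibatch, reading off `mi_lower_bound` as the running MI estimate.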
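The stacked MNIST dataset referenced in the Open Datasets row is built by stacking three randomly sampled digits into three channels. A hedged NumPy sketch of that construction follows; array shapes and the sampling procedure are assumptions, since the paper defers to prior work for the exact setup.

```python
import numpy as np

def make_stacked_mnist(images, n_samples, seed=None):
    """Stack three randomly drawn 28x28 MNIST digits along the channel axis,
    giving one of 1000 possible three-digit 'modes' per sample.
    `images` is assumed to be an (N, 28, 28) array of MNIST digits."""
    rng = np.random.default_rng(seed)
    idx = rng.integers(0, len(images), size=(n_samples, 3))
    # Output shape: (n_samples, 3, 28, 28), one digit per channel.
    return np.stack([images[idx[:, c]] for c in range(3)], axis=1)
```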
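For the Experiment Setup row, here is a rough sketch of how the β-weighted mutual-information bonus and adaptive gradient clipping might be combined in the generator update. The paper's exact clipping rule is in its Supplementary Material, so the helper name, per-parameter handling, and epsilon constant below are assumptions.

```python
import torch

def generator_gradients(gan_loss, mi_estimate, params, beta=1.0):
    """Combine the standard GAN generator loss with a MINE mutual-information
    bonus (beta = 0 recovers the orthodox GAN). The gradient of the unbounded
    MI term is rescaled so its norm never exceeds that of the GAN gradient."""
    params = list(params)
    grads_gan = torch.autograd.grad(gan_loss, params, retain_graph=True)
    grads_mi = torch.autograd.grad(-beta * mi_estimate, params, retain_graph=True)

    gan_norm = torch.sqrt(sum((g ** 2).sum() for g in grads_gan))
    mi_norm = torch.sqrt(sum((g ** 2).sum() for g in grads_mi))
    scale = torch.clamp(gan_norm / (mi_norm + 1e-8), max=1.0)

    # The generator is then updated with grads_gan + scale * grads_mi.
    return [gg + scale * gm for gg, gm in zip(grads_gan, grads_mi)]
```

With `beta = 0` this reduces to the ordinary GAN generator gradient; with `beta = 1.0` the MI bonus is added but its gradient norm is capped at that of the GAN term, matching the stated goal of keeping the two learning signals similar in magnitude.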