Deep Variational Information Bottleneck
Authors: Alexander A. Alemi, Ian Fischer, Joshua V. Dillon, Kevin Murphy
ICLR 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we present various experimental results, comparing the behavior of standard deterministic networks to stochastic neural networks trained by optimizing the VIB objective. We show that models trained with the VIB objective outperform those that are trained with other forms of regularization, in terms of generalization performance and robustness to adversarial attack. |
| Researcher Affiliation | Industry | Alexander A. Alemi, Ian Fischer, Joshua V. Dillon, Kevin Murphy Google Research {alemi,iansf,jvdillon,kpmurphy}@google.com |
| Pseudocode | No | The paper provides mathematical derivations and descriptions of the proposed method, but it does not include any formally presented pseudocode or algorithm blocks (e.g., in a labeled figure or environment). |
| Open Source Code | No | The paper mentions that 'Carlini & Wagner (2016) shared their code with us' and refers to 'publicly available, pretrained checkpoints' of other models, but it does not explicitly state that the authors' own source code for the VIB method is publicly available or provided. |
| Open Datasets | Yes | We start with experiments on unmodified MNIST (i.e. no data augmentation). We make use of publicly available, pretrained checkpoints of Inception ResNet V2 (Szegedy et al., 2016) on ImageNet (Deng et al., 2009). |
| Dataset Splits | Yes | For the MNIST experiments, a batch size of 100 was used, and the full 60,000 training and validation set was used for training, and the 10,000 test images for test results. |
| Hardware Specification | No | The paper mentions training with TensorFlow, which implies the use of computational resources, but it does not specify any particular hardware details such as GPU models, CPU types, or memory specifications used for the experiments. |
| Software Dependencies | No | The paper states 'All of the networks for this paper were trained using TensorFlow (Abadi et al., 2016)' and 'The Adam optimizer (Kingma & Ba, 2015) was used'. While these are software components, specific version numbers for TensorFlow or the Adam optimizer are not provided. |
| Experiment Setup | Yes | The networks were trained using TensorFlow for 200 epochs using the Adam optimizer (Kingma & Ba, 2015) with a learning rate of 0.0001. Full hyperparameter details can be found in Appendix A. Appendix A details: an initial learning rate of 10⁻⁴, (β₁ = 0.5, β₂ = 0.999) and exponential decay, decaying the learning rate by a factor of 0.97 every 2 epochs. The networks were all trained for 200 epochs total. For the MNIST experiments, a batch size of 100 was used... The input images were scaled to have values between -1 and 1 before being fed to the network. (A hedged configuration sketch follows the table.) |
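
The experiment setup row quotes the paper's Appendix A hyperparameters: Adam with an initial learning rate of 10⁻⁴ and (β₁ = 0.5, β₂ = 0.999), exponential decay by 0.97 every 2 epochs, 200 epochs, batch size 100, and inputs scaled to [-1, 1]. The snippet below is a minimal sketch of that configuration using the standard `tf.keras` optimizer and schedule APIs. The model architecture and the VIB objective itself are not reproduced here, and expressing the "every 2 epochs" decay as a staircase schedule over optimizer steps is an assumption, not something stated in the paper.

```python
# Hedged sketch of the training configuration quoted above (Appendix A of the
# paper). Only the optimizer, learning-rate schedule, and input scaling are
# taken from the quoted text; everything else (API choice, staircase decay)
# is an assumption.
import tensorflow as tf

BATCH_SIZE = 100
EPOCHS = 200
STEPS_PER_EPOCH = 60_000 // BATCH_SIZE  # full MNIST train+validation set

# Exponential decay: factor 0.97 every 2 epochs, approximated per optimizer step.
lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=1e-4,
    decay_steps=2 * STEPS_PER_EPOCH,
    decay_rate=0.97,
    staircase=True,
)

optimizer = tf.keras.optimizers.Adam(
    learning_rate=lr_schedule, beta_1=0.5, beta_2=0.999
)

def preprocess(image, label):
    # Scale pixel values from [0, 255] to [-1, 1], as described in Appendix A.
    image = tf.cast(image, tf.float32) / 127.5 - 1.0
    return image, label
```

With a batch size of 100 and 60,000 training images, `decay_steps = 2 * 600 = 1200` optimizer steps corresponds to the "every 2 epochs" decay interval quoted above.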