Estimating Information Flow in Deep Neural Networks

Authors: Ziv Goldfeld, Ewout van den Berg, Kristjan Greenewald, Igor Melnyk, Nam Nguyen, Brian Kingsbury, Yury Polyanskiy

ICML 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results verify this connection. In Section 5.1 we experimentally demonstrate that, in some cases, I(X; Tℓ) exhibits compression during training of noisy DNNs. We trained four-layer convolutional neural networks (CNNs) on MNIST (LeCun et al., 1999). ... We measured their performance on the validation set and characterized the cosine similarities between their internal representations... The experiments demonstrate that I(X; Tℓ) compression in noisy DNNs is driven by clustering of internal representations, and that deterministic DNNs cluster samples as well. (The noisy-layer construction is sketched in code after the table.)
Researcher Affiliation | Collaboration | Ziv Goldfeld (1,2), Ewout van den Berg (2,3), Kristjan Greenewald (2,3), Igor Melnyk (2,3), Nam Nguyen (2,3), Brian Kingsbury (2,3), Yury Polyanskiy (1,2). Affiliations: (1) Massachusetts Institute of Technology, (2) MIT-IBM Watson AI Lab, (3) IBM Research.
Pseudocode | No | No direct match. The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | Code to replicate the experiments in this paper is in preparation, and Goldfeld et al. (2018) will be updated when it is available.
Open Datasets | Yes | we trained four-layer convolutional neural networks (CNNs) on MNIST (LeCun et al., 1999).
Dataset Splits | No | No direct match. The paper mentions using a "validation set" and reports "MNIST validation errors", but it does not specify exact percentages or sample counts for the training, validation, or test splits, nor does it cite a predefined split.
Hardware Specification | No | No direct match. The paper does not provide specific details about the hardware (e.g., CPU or GPU models, memory) used to run the experiments.
Software Dependencies | No | No direct match. The paper does not specify version numbers for any software dependencies, such as programming languages, libraries (e.g., PyTorch, TensorFlow), or other tools used in the experiments.
Experiment Setup | Yes | The CNNs used different internal noise levels (including β = 0) and one used dropout instead of additive noise. Let σ = tanh, β = 0.01 and X = X₋₁ ∪ X₁, with X₋₁ = {-3, -1, 1} and X₁ = {3}, labeled -1 and 1, respectively. We train the neuron using mean squared loss and gradient descent with learning rate 0.01 to illustrate I(X; T(k)) trends. The FCN was tested with tanh and ReLU nonlinearities as well as a linear model. Fig. 5(a) presents results for the tanh SZT model with β = 0.005 (test classification accuracy 97%). (This single-neuron setup is sketched in code after the table.)
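
The noisy-DNN construction quoted in the Research Type row adds isotropic Gaussian noise of standard deviation β after each hidden activation, which is what makes I(X; Tℓ) finite and estimable in the first place. Below is a minimal PyTorch sketch of that construction, assuming illustrative layer widths and helper names (NoisyLayer and cosine_similarity_matrix are not from the paper); only the additive internal noise and the use of pairwise cosine similarities between internal representations follow the quoted description.

    # Illustrative sketch, not the authors' released code:
    # T_l = sigma(W_l T_{l-1} + b_l) + Z_l, with Z_l ~ N(0, beta^2 I),
    # i.e. additive isotropic Gaussian noise applied after the nonlinearity.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class NoisyLayer(nn.Module):
        """Fully connected layer, tanh activation, then additive Gaussian noise of std beta."""
        def __init__(self, in_dim, out_dim, beta=0.005, activation=torch.tanh):
            super().__init__()
            self.fc = nn.Linear(in_dim, out_dim)
            self.beta = beta
            self.activation = activation

        def forward(self, x):
            t = self.activation(self.fc(x))
            if self.beta > 0:
                t = t + self.beta * torch.randn_like(t)  # internal noise Z_l
            return t

    def cosine_similarity_matrix(reps):
        """Pairwise cosine similarities between internal representations (rows of reps),
        the quantity used to relate compression of I(X; T_l) to clustering."""
        reps = F.normalize(reps, dim=1)
        return reps @ reps.t()

    # Usage sketch: a tiny two-layer noisy network and the similarity structure of its hidden layer.
    layer1, layer2 = NoisyLayer(784, 128), NoisyLayer(128, 10)
    x = torch.randn(32, 784)                 # stand-in for a batch of flattened MNIST images
    hidden = layer1(x)
    logits = layer2(hidden)
    sims = cosine_similarity_matrix(hidden)  # (32, 32) matrix of pairwise similarities

Note that the noise here is part of the model itself rather than a training-time regularizer, matching the paper's point that I(X; Tℓ) is only meaningful when the layer map is stochastic.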
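
The single-neuron example quoted in the Experiment Setup row (σ = tanh, β = 0.01, inputs {-3, -1, 1} labeled -1 and {3} labeled 1, mean squared loss, gradient descent with learning rate 0.01) can be illustrated as follows. This is an assumption-laden sketch rather than the authors' implementation: X is assumed uniform over the four points, the step count and initialization are assumed, and I(X; T) for T = tanh(wX + b) + Z, Z ~ N(0, β²), is estimated by simple Monte Carlo over the resulting four-component Gaussian mixture, whereas the paper develops its own differential-entropy estimators.

    # Sketch of the single noisy neuron example; hyperparameters beta and lr come from the
    # quoted setup, everything else (steps, init, MC estimator) is an illustrative assumption.
    import numpy as np

    rng = np.random.default_rng(0)
    xs = np.array([-3.0, -1.0, 1.0, 3.0])    # X_{-1} = {-3, -1, 1}, X_1 = {3}
    ys = np.array([-1.0, -1.0, -1.0, 1.0])   # labels -1 and 1
    beta, lr, steps = 0.01, 0.01, 2000
    w, b = 0.1, 0.0

    def mutual_information(w, b, beta, n_samples=100_000):
        """Monte Carlo estimate of I(X; T) = h(T) - h(T | X) in nats."""
        mus = np.tanh(w * xs + b)                        # class-conditional means of T
        x_idx = rng.integers(len(xs), size=n_samples)    # X uniform over the four points
        t = mus[x_idx] + beta * rng.standard_normal(n_samples)
        # Marginal density of T: equal-weight mixture of Gaussians centered at each mu.
        dens = np.mean(
            np.exp(-(t[:, None] - mus[None, :]) ** 2 / (2 * beta ** 2)), axis=1
        ) / (np.sqrt(2 * np.pi) * beta)
        h_t = -np.mean(np.log(dens))                             # h(T), Monte Carlo
        h_t_given_x = 0.5 * np.log(2 * np.pi * np.e * beta ** 2) # h(T | X) = h(Z)
        return h_t - h_t_given_x

    for step in range(steps + 1):
        t = np.tanh(w * xs + b) + beta * rng.standard_normal(len(xs))   # noisy output
        grad = 2 * (t - ys) * (1 - np.tanh(w * xs + b) ** 2) / len(xs)  # d(MSE)/d(pre-activation)
        w -= lr * np.sum(grad * xs)                                     # full-batch gradient descent
        b -= lr * np.sum(grad)
        if step % 500 == 0:
            print(f"step {step:5d}  I(X;T) ~ {mutual_information(w, b, beta):.3f} nats")

With four equiprobable inputs, I(X; T) is bounded by ln 4 ≈ 1.39 nats; tracking the printed values over training gives a small-scale picture of the I(X; T(k)) trends the quoted setup refers to.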