Phase Transitions for the Information Bottleneck in Representation Learning
Authors: Tailin Wu, Ian Fischer
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We quantitatively and qualitatively test the ability of our theory and Algorithm 1 to provide good predictions for IB phase transitions. We first verify them in fully categorical settings, where X, Y, Z are all discrete, and we show that the phase transitions can correspond to learning new classes as we increase β. We then test our algorithm on versions of the MNIST and CIFAR10 datasets with added label noise. |
| Researcher Affiliation | Collaboration | Tailin Wu, Stanford, tailin@cs.stanford.edu; Ian Fischer, Google Research, iansf@google.com |
| Pseudocode | Yes | Algorithm 1 Phase transitions discovery for IB |
| Open Source Code | No | No explicit statement about releasing the source code for the methodology described in this paper, nor a link to a code repository, was found. |
| Open Datasets | Yes | CIFAR10 dataset (Krizhevsky & Hinton, 2009) |
| Dataset Splits | No | The paper mentions using MNIST training examples and the CIFAR10 dataset, but does not specify the train/validation/test splits, percentages, or absolute sample counts needed for reproducibility. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., CPU, GPU models, or memory) used to run its experiments. |
| Software Dependencies | No | The paper mentions using Adam optimizer and a Wide ResNet implementation, but does not specify version numbers for these or any other software dependencies, which are necessary for reproducibility. |
| Experiment Setup | Yes | For MNIST: The encoder is a three-layer neural net, where each hidden layer has 512 neurons and leaky ReLU activation, and the last layer has linear activation. The classifier p(y\|z) is a 2-layer neural net with a 128-neuron ReLU hidden layer. The backward encoder p(z\|y) is also a 2-layer neural net with a 128-neuron ReLU hidden layer. We trained with Adam (Kingma & Ba, 2015) at a learning rate of 10^-3, annealed down by a factor of 1/(1 + 0.01·epoch). For Alg. 1, f_θ uses the same architecture as the CEB encoder, with \|Z\| = 50. For CIFAR10: We trained 28-1 Wide ResNet models... Samples from the encoder were passed to the classifier, a 2-layer MLP. ... β from 1.0 to 6.0 with step size of 0.02. ... annealing β from 100 down to the target β over 600 epochs, and continuing to train at the target β for another 800 epochs. ... base learning rate of 10^-3, reduced by a factor of 0.5 at 300, 400, and 500 epochs. ... \|Z\| = 50 in Alg. 1. (Hedged sketches of this setup follow the table.) |
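
For concreteness, here is a minimal PyTorch sketch of the quoted MNIST architecture and learning-rate schedule. The latent dimension, the mean/log-variance parameterization of the encoders, and the one-hot label input to the backward encoder are assumptions made for illustration; the excerpt only fixes the layer widths, activations, optimizer, and annealing factor.

```python
# Sketch of the quoted MNIST setup (not the authors' code).
# LATENT_DIM, the mean/log-variance outputs, and the one-hot label input
# are assumptions; layer widths, activations, Adam, and the 1/(1 + 0.01*epoch)
# learning-rate annealing come from the quoted setup.
import torch
import torch.nn as nn

LATENT_DIM = 8        # assumed; the excerpt does not state the latent size
NUM_CLASSES = 10

# Encoder p(z|x): three layers, 512-neuron leaky-ReLU hidden layers, linear last layer.
encoder = nn.Sequential(
    nn.Linear(28 * 28, 512), nn.LeakyReLU(),
    nn.Linear(512, 512), nn.LeakyReLU(),
    nn.Linear(512, 2 * LATENT_DIM),          # mean and log-variance (assumed)
)

# Classifier p(y|z): 2-layer net with a 128-neuron ReLU hidden layer.
classifier = nn.Sequential(
    nn.Linear(LATENT_DIM, 128), nn.ReLU(),
    nn.Linear(128, NUM_CLASSES),
)

# Backward encoder p(z|y): 2-layer net with a 128-neuron ReLU hidden layer,
# fed a one-hot label here (assumed input encoding).
backward_encoder = nn.Sequential(
    nn.Linear(NUM_CLASSES, 128), nn.ReLU(),
    nn.Linear(128, 2 * LATENT_DIM),
)

params = (list(encoder.parameters())
          + list(classifier.parameters())
          + list(backward_encoder.parameters()))
optimizer = torch.optim.Adam(params, lr=1e-3)

# Anneal the learning rate down by a factor of 1 / (1 + 0.01 * epoch);
# call scheduler.step() once per epoch.
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer, lr_lambda=lambda epoch: 1.0 / (1.0 + 0.01 * epoch))
```

Everything else needed for a full training run (batch size, number of epochs, and the variational CEB objective at a given β) is not pinned down by the excerpt and would have to come from the paper or its appendix.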
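
The CIFAR10 part of the setup is mostly a schedule: a sweep of target β values from 1.0 to 6.0 in steps of 0.02, each run annealing β from 100 down to the target over 600 epochs and then holding it for another 800, with the base learning rate of 10^-3 halved at epochs 300, 400, and 500. The sketch below encodes that schedule; the interpolation used for the β annealing is not stated in the excerpt, so geometric (log-space) interpolation is an assumption.

```python
# Schedule sketch for the quoted CIFAR10 setup. The geometric interpolation
# of beta is an assumption; the sweep range, annealing/hold epochs, and
# learning-rate milestones come from the quoted setup.
import numpy as np

def beta_at_epoch(epoch, target_beta, start_beta=100.0, anneal_epochs=600):
    """Anneal beta from start_beta down to target_beta over anneal_epochs;
    after that, training continues at target_beta (for another 800 epochs
    in the quoted setup)."""
    if epoch >= anneal_epochs:
        return float(target_beta)
    frac = epoch / anneal_epochs
    # Log-space interpolation between start_beta and target_beta (assumed).
    return float(np.exp((1 - frac) * np.log(start_beta)
                        + frac * np.log(target_beta)))

def lr_at_epoch(epoch, base_lr=1e-3):
    """Base learning rate 1e-3, reduced by a factor of 0.5 at epochs 300, 400, 500."""
    drops = sum(epoch >= milestone for milestone in (300, 400, 500))
    return base_lr * (0.5 ** drops)

# Target beta values swept from 1.0 to 6.0 with step size 0.02.
target_betas = np.arange(1.0, 6.0 + 1e-9, 0.02)
```

The quoted sweep implies one Wide ResNet 28-1 model trained per target β, giving the empirical curve against which the phase transitions predicted by Alg. 1 are checked.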