State-Reification Networks: Improving Generalization by Modeling the Distribution of Hidden Representations

Authors: Alex Lamb, Jonathan Binas, Anirudh Goyal, Sandeep Subramanian, Ioannis Mitliagkas, Yoshua Bengio, Michael Mozer

ICML 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate the effectiveness of state reification on three classes of problems: sequence classification in a data-limited training environment, generation of long sequences, and adversarial perturbations in image processing.
Researcher Affiliation | Collaboration | Mila; University of Colorado, Boulder; CIFAR Senior Fellow; Google.
Pseudocode | No | The paper describes algorithms such as denoising autoencoders and attractor networks, but it does not contain any formal pseudocode blocks or sections labeled "Algorithm". (A hedged sketch of an attractor-style cleanup step follows the table.)
Open Source Code | No | The paper does not include any statement about releasing open-source code for the described methodology, nor does it provide a link to a code repository.
Open Datasets | Yes | We trained a language model on the standard Text8 dataset, which is derived from Wikipedia articles. We constructed FGSM adversarial examples (ε = 0.3) on small MNIST fully-connected networks trained normally. Tables 1 and 2 present results applying state reification on CIFAR10 using non-ResNet and ResNet convolutional nets (CNNs), respectively. (A hedged FGSM sketch follows the table.)
Dataset Splits | No | The paper describes training and test sets, for example: 'We trained on 256 randomly selected binary sequences. Two distinct test sets were used to evaluate models: one consisted of the held-out 768 binary sequences...'. Similarly for other tasks, it specifies training and test examples (e.g., 'The number of training examples was varied from 50 to 800, always with 2000 test examples.'), but it does not explicitly mention a distinct validation set with specific split percentages or counts.
Hardware Specification | No | The paper does not provide specific details about the hardware used for the experiments, such as GPU or CPU models.
Software Dependencies | No | The paper mentions types of neural networks like 'tanh and GRU hidden units' or 'single-layer LSTM', but it does not specify any software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x, TensorFlow 2.x).
Experiment Setup | Yes | For the parity task: 'We trained the attractor net with σ = .5 and ran it for exactly 15 iterations'. For adversarial training: 'We used an l∞ attack with ε ranging from 0.03 to 0.3 and number of iterations ranging from 7 to 200'. The captions for Tables 1 and 2 also state: 'Both experiments were run for 200 epochs'. (A hedged iterated-attack sketch follows the table.)
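
Since the paper provides no pseudocode for its attractor-network cleanup step, the following is a minimal sketch of the general idea as this report understands it: Gaussian noise (σ) is added to a hidden state during training, and a small recurrent map is iterated a fixed number of times (the parity setup reports σ = .5 and 15 iterations) to pull the state back toward the clean manifold. The module names, dimensions, and parameterization are assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class AttractorCleanup(nn.Module):
    """Illustrative attractor-style state cleanup (not the authors' code).

    A noisy copy of a hidden state h is mapped into an attractor space,
    iterated for a fixed number of steps, then mapped back.
    """

    def __init__(self, hidden_dim, attractor_dim, sigma=0.5, n_steps=15):
        super().__init__()
        self.sigma = sigma        # noise level during training (paper reports .5)
        self.n_steps = n_steps    # fixed number of iterations (paper reports 15)
        self.in_map = nn.Linear(hidden_dim, attractor_dim)    # hidden -> attractor space
        self.recur = nn.Linear(attractor_dim, attractor_dim)  # attractor dynamics
        self.out_map = nn.Linear(attractor_dim, hidden_dim)   # attractor -> hidden space

    def forward(self, h):
        if self.training:
            h = h + self.sigma * torch.randn_like(h)  # corrupt the state at training time
        a = torch.tanh(self.in_map(h))
        for _ in range(self.n_steps):
            a = torch.tanh(self.recur(a) + self.in_map(h))  # iterate toward an attractor
        return self.out_map(a)
```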
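
For the MNIST result quoted under Open Datasets, FGSM with ε = 0.3 is a standard one-step attack; a minimal sketch is below. The model, data handling, and [0, 1] pixel range are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, epsilon=0.3):
    """One-step FGSM: perturb x in the direction of the sign of the loss gradient."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    x_adv = x + epsilon * x.grad.sign()    # epsilon = 0.3 as reported in the paper
    return x_adv.clamp(0.0, 1.0).detach()  # keep pixels in a valid [0, 1] range
```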
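
The quoted adversarial-training setup (an l∞ attack with ε from 0.03 to 0.3 and 7 to 200 iterations) is consistent with an iterated, PGD-style attack; a sketch under that assumption follows. The step size and the projection onto the ε-ball are illustrative choices, not values reported in the paper.

```python
import torch
import torch.nn.functional as F

def pgd_linf_attack(model, x, y, epsilon=0.03, step_size=0.01, n_iter=7):
    """Iterated l-infinity attack (PGD-style sketch; step_size is assumed)."""
    x_orig = x.clone().detach()
    x_adv = x_orig.clone()
    for _ in range(n_iter):  # paper reports 7 to 200 iterations
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + step_size * grad.sign()
            x_adv = x_orig + (x_adv - x_orig).clamp(-epsilon, epsilon)  # project onto the epsilon ball
            x_adv = x_adv.clamp(0.0, 1.0)  # stay in the valid pixel range
    return x_adv.detach()
```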