Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Optimal Binary Autoencoding with Pairwise Correlations
Authors: Akshay Balsubramani
ICLR 2017 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments in Section 6 show extremely competitive results with equivalent fully-connected autoencoders trained with backpropagation. The datasets we use are first normalized to [0, 1], and then binarized by sampling each pixel stochastically in proportion to its intensity, following prior work (Salakhutdinov & Murray (2008)). |
| Researcher Affiliation | Academia | Akshay Balsubramani Stanford University EMAIL |
| Pseudocode | Yes | Algorithm 1 Pairwise Correlation Autoencoder (PC-AE) Input: Size-n dataset $\hat{X}$, number of epochs T |
| Open Source Code | Yes | TensorFlow code available at https://github.com/aikanor/pc-autoencoder . |
| Open Datasets | Yes | The datasets we use are first normalized to [0, 1], and then binarized by sampling each pixel stochastically in proportion to its intensity, following prior work (Salakhutdinov & Murray (2008)). We use the preprocessed version of the Omniglot dataset found in Burda et al. (2016), split 1 of the Caltech-101 Silhouettes dataset, the small notMNIST dataset, and the UCI Adult (a1a) dataset. |
| Dataset Splits | No | The paper mentions 'early stopping on the test set' and '10-fold cross-validation' for 'notMNIST', but it does not explicitly describe a separate validation dataset split with specific percentages or counts for all experiments. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments, such as CPU/GPU models, memory, or cloud instance types. |
| Software Dependencies | No | The paper mentions 'TensorFlow code available at https://github.com/aikanor/pc-autoencoder', and the use of 'Adagrad (Duchi et al. (2011))' and the 'Adam method with default parameters (Kingma & Ba (2014))' for optimization, but it does not specify version numbers for these software components or libraries. |
| Experiment Setup | Yes | We used minibatches of size 250. All standard autoencoders use the Xavier initialization and trained for 500 epochs or using early stopping on the test set. We compare to a basic AE with a single hidden layer, trained using the Adam method with default parameters (Kingma & Ba (2014)). |
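
The preprocessing quoted in the table (normalize to [0, 1], then binarize each pixel stochastically in proportion to its intensity) can be illustrated with a minimal sketch. This is not taken from the paper's TensorFlow repository; the function name `binarize`, the `max_value` of 255, and the `seed` parameter are assumptions made for illustration only.

```python
import random


def binarize(pixels, max_value=255.0, seed=None):
    """Stochastically binarize raw pixel intensities.

    Hypothetical helper, not from the paper's code. Each intensity is
    scaled to [0, 1] and the pixel is set to 1 with that probability.
    """
    rng = random.Random(seed)  # local generator so results can be reproduced
    binary = []
    for p in pixels:
        # Normalize to [0, 1]; clamp in case of out-of-range input.
        prob = min(max(p / max_value, 0.0), 1.0)
        # rng.random() lies in [0, 1), so prob 0 never fires and prob 1 always does.
        binary.append(1 if rng.random() < prob else 0)
    return binary


if __name__ == "__main__":
    row = [0, 64, 128, 255]
    print(binarize(row, seed=0))
```

Because the binarization is sampled, two runs with different seeds generally produce different binary images from the same source image, so fixing the seed (or storing the binarized data) matters when comparing results.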