The Information Sieve

Authors: Greg Ver Steeg, Aram Galstyan

ICML 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We present a practical implementation of this framework for discrete variables and apply it to a variety of fundamental tasks in unsupervised learning including independent component analysis, lossy and lossless compression, and predicting missing values in data."
Researcher Affiliation | Academia | Greg Ver Steeg (GREGV@ISI.EDU) and Aram Galstyan (GALSTYAN@ISI.EDU), University of Southern California, Information Sciences Institute, Marina del Rey, CA 90292 USA
Pseudocode | No | The paper describes its algorithmic steps in prose and references appendices for the constructions, but does not include structured pseudocode or algorithm blocks.
Open Source Code | Yes | "Code implementing this entire pipeline is available (Ver Steeg)." From the references: Ver Steeg, Greg. Open source project implementing the discrete information sieve. http://github.com/gregversteeg/discrete_sieve
Open Datasets | Yes | "For the following tasks, we consider 50k MNIST digits that were binarized at the normalized grayscale threshold of 0.5." (A preprocessing sketch follows the table.)
Dataset Splits | No | The paper states: "We use 50k digits as training for models, and report compression results on the 10k test digits." This describes a train/test split, but no validation set or explicit train/validation/test percentages are given.
Hardware Specification | No | The paper does not report the hardware used to run the experiments (e.g., specific CPU/GPU models or memory).
Software Dependencies | No | The paper does not list software dependencies with version numbers (e.g., library names such as TensorFlow or PyTorch, along with the versions used).
Experiment Setup | Yes | "The 28 × 28 binarized images are treated as binary vectors in a 784 dimensional space. The digit labels are also not used in our analysis. We trained the information sieve on this data, adding layers as long as the bounds were tightening. This led to a 12 layer representation and a lower bound on TC(X) of about 40 bits." (A toy sketch of this layer-adding loop appears below.)
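To make the preprocessing quoted in the "Open Datasets" and "Dataset Splits" rows concrete, here is a minimal sketch. It assumes MNIST arrives as 70k flattened 784-pixel grayscale images; fetch_openml is one common way to load it, not necessarily the authors' pipeline, and taking the first 50k digits for training plus the standard 10k test digits is an illustrative reading of the split they describe.

```python
import numpy as np
from sklearn.datasets import fetch_openml

# Load MNIST as a (70000, 784) array of grayscale values in [0, 255].
mnist = fetch_openml("mnist_784", version=1, as_frame=False)
X = mnist.data / 255.0                 # normalize grayscale to [0, 1]

# Binarize at the normalized grayscale threshold of 0.5, as in the paper.
X_bin = (X > 0.5).astype(np.int8)

# 50k digits for training, the 10k test digits for compression results;
# the paper describes no validation set.
X_train, X_test = X_bin[:50_000], X_bin[60_000:]
```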
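The "Experiment Setup" row says layers were added "as long as the bounds were tightening". The authors' discrete_sieve repository implements the actual construction; the sketch below is only a toy illustration of that stopping rule on binary data, and ToyLayer (which merely copies the single most informative column and keeps an XOR remainder), mi_binary, and train_sieve are hypothetical stand-ins, not the paper's optimization or the repository's API.

```python
import numpy as np

def mi_binary(a, b):
    """Empirical mutual information in bits between two binary arrays."""
    joint = np.bincount(2 * a.astype(int) + b.astype(int), minlength=4)
    joint = joint.reshape(2, 2) / len(a)
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float((joint[nz] * np.log2(joint[nz] / (px * py)[nz])).sum())

class ToyLayer:
    """Toy stand-in for a sieve layer (not the paper's construction):
    Y copies the one column sharing the most information with the rest;
    the remainder XORs each column with its best prediction from Y, so
    (remainder, Y) recovers X and the remainder keeps what Y missed."""
    def fit(self, X):
        d = X.shape[1]
        scores = [sum(mi_binary(X[:, i], X[:, j]) for i in range(d) if i != j)
                  for j in range(d)]            # O(d^2) scan: toy scale only
        self.j = int(np.argmax(scores))
        self.tc = scores[self.j]                # TC(X;Y) for this choice of Y
        Y = X[:, self.j]
        # Majority-vote prediction of every column for each value of Y.
        self.pred = np.array(
            [[np.bincount(X[Y == y, i], minlength=2).argmax() for i in range(d)]
             for y in range(2)], dtype=X.dtype)
        return self

    def transform(self, X):
        return X ^ self.pred[X[:, self.j]]      # binary remainder via XOR

def train_sieve(X, tol=1e-3, max_layers=20):
    """Keep adding layers while each one tightens the TC(X) lower bound."""
    layers, remainder, tc_bound = [], X, 0.0
    while len(layers) < max_layers:
        layer = ToyLayer().fit(remainder)
        if layer.tc <= tol:                     # bound no longer tightening
            break
        tc_bound += layer.tc                    # per-layer contributions add
        remainder = layer.transform(remainder)
        layers.append(layer)
    return layers, tc_bound

# e.g. layers, bound = train_sieve(X_bin[:2000, 300:364]) on a small crop
```

Note the quadratic column scan: at full 784-dimensional MNIST scale this toy would be far too slow, so it is meant only to show the bound-tightening stopping rule, not to reproduce the 12-layer, roughly 40-bit result.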