Learning and Memorization
Authors: Satrajit Chatterjee
ICML 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiment 1. In the first experiment, we apply the above procedure to the Binary-MNIST task (as defined in Section 3) to see if this approach to memorization can generalize. For this experiment, we construct a network with 5 hidden layers of 1024 LUTs and 1 LUT in the output layer. We set k = 8, i.e., each LUT in the network takes 8 inputs. The network achieves a training accuracy of 0.89 on this task, which is perhaps not so surprising since we are memorizing the training data after all. But what is surprising is that the network achieves an accuracy of 0.87 on a held-out set (the 10,000 test images in MNIST) which indicates generalization. |
| Researcher Affiliation | Industry | Satrajit Chatterjee, Two Sigma, New York, NY, USA. Correspondence to: Satrajit Chatterjee <satrajit.chatterjee@twosigma.com>. |
| Pseudocode | No | The paper describes the learning procedure in textual paragraphs (e.g., in Section 2 and 3) but does not include any formally structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code, nor does it provide a link to a code repository or mention code in supplementary materials. |
| Open Datasets | Yes | Now consider a binary classification task on MNIST (LeCun & Cortes, 2010) of separating the digits 0 through 4 (we map these to the 0 class) from the digits 5 through 9 (the 1 class) where the pixels are 1-bit quantized. Thus the task is to learn a function f : B^(28×28) → B. We call this the Binary-MNIST task (overloading binary here to mean both binary classification and binary inputs). ... Experiment 7. Next we look at memorization on CIFAR-10 which is a collection of 32 pixel by 32 pixel color images belonging to 10 classes. |
| Dataset Splits | No | The paper mentions training data and a 'held-out set (the 10,000 test images in MNIST)' which functions as a test set. It does not explicitly define a separate validation set or its split, nor does it specify exact training/validation/test percentages or sample counts beyond the general description of MNIST test images. |
| Hardware Specification | No | The paper mentions that 'it typically takes less than 30 seconds using a single-threaded unoptimized implementation (Python with NumPy) to run an experiment,' implying the use of a CPU, but it does not specify any exact CPU model, GPU, or other hardware details. |
| Software Dependencies | No | The paper mentions 'Python with NumPy' as the implementation environment but does not provide specific version numbers for either Python or NumPy. |
| Experiment Setup | Yes | For this experiment, we construct a network with 5 hidden layers of 1024 LUTs and 1 LUT in the output layer. We set k = 8, i.e., each LUT in the network takes 8 inputs. |
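
The Open Datasets row above fully determines the Binary-MNIST construction, so a minimal sketch is possible. This is our illustration, not code from the paper: the function name `to_binary_mnist` and the 0.5 quantization threshold (on pixels scaled to [0, 1]) are assumptions, since the excerpt only says the pixels are 1-bit quantized.

```python
import numpy as np

def to_binary_mnist(images, labels):
    """Build the Binary-MNIST task: f : B^(28x28) -> B.

    `images` is an (n, 28, 28) array of pixel intensities in [0, 1] and
    `labels` the usual 0-9 digit labels. The 0.5 threshold is an assumption;
    the paper only says the pixels are 1-bit quantized.
    """
    x = (images.reshape(-1, 28 * 28) >= 0.5).astype(np.uint8)  # 1-bit pixels
    y = (labels >= 5).astype(np.uint8)  # digits 0-4 -> class 0, 5-9 -> class 1
    return x, y
```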
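
The Experiment Setup row pins down the architecture (5 hidden layers of 1024 LUTs, 1 output LUT, k = 8), but the quoted excerpts do not spell out the memorization rule itself. The sketch below shows one plausible reading, assuming each LUT is wired to k random outputs of the previous layer and stores, for each of its 2^k input patterns, the majority training label it saw; the function names, random wiring, and tie-breaking toward 0 are all our assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_lut_layer(inputs, targets, width, k=8):
    """Memorize one layer of k-input LUTs (our reading of the paper's rule).

    Each of the `width` LUTs reads k random columns of `inputs` and stores
    the majority value of `targets` for every one of its 2^k input patterns.
    """
    n, d = inputs.shape
    wires = rng.integers(0, d, size=(width, k))       # random fan-in per LUT
    powers = 1 << np.arange(k)
    addrs = inputs[:, wires] @ powers                 # (n, width) LUT addresses
    tables = np.zeros((width, 1 << k), dtype=np.uint8)
    for j in range(width):
        ones = np.bincount(addrs[:, j], weights=targets, minlength=1 << k)
        seen = np.bincount(addrs[:, j], minlength=1 << k)
        tables[j] = 2 * ones > seen                   # majority vote; ties -> 0
    return wires, tables, tables[np.arange(width), addrs]

def train_lut_network(x, y, depth=5, width=1024, k=8):
    """Stack `depth` hidden LUT layers plus a single output LUT."""
    acts, net = x, []
    for _ in range(depth):
        wires, tables, acts = train_lut_layer(acts, y, width, k)
        net.append((wires, tables))
    wires, tables, preds = train_lut_layer(acts, y, 1, k)  # output LUT
    net.append((wires, tables))
    return net, preds.ravel()                         # predictions on train set
```

Evaluating held-out data would replay the stored `(wires, tables)` pairs layer by layer; on the quoted setup this construction memorizes the training labels, and the paper's surprising finding is that it also reaches 0.87 accuracy on the 10,000 held-out MNIST test images.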