Variational Network Quantization

Authors: Jan Achterhold, Jan Mathias Koehler, Anke Schmeink, Tim Genewein

ICLR 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Results are shown for ternary quantization on LeNet-5 (MNIST) and DenseNet (CIFAR-10). In our experiments, we train with VNQ and then first prune via thresholding log α_ij ≥ log T_α = 2. We demonstrate our method with LeNet-5 (LeCun et al., 1998) on the MNIST handwritten digits dataset. Our second experiment uses a modern DenseNet (Huang et al., 2017) (k = 12, depth L = 76, with bottlenecks) on CIFAR-10 (Krizhevsky & Hinton, 2009). (A sketch of the thresholding step appears after this table.)
Researcher Affiliation | Collaboration | Jan Achterhold1,2, Jan M. Köhler1, Anke Schmeink2 & Tim Genewein1,*; 1Bosch Center for Artificial Intelligence, Robert Bosch GmbH, Renningen, Germany; 2RWTH Aachen University, Institute for Theoretical Information Technology, Aachen, Germany
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not state that source code is released, nor does it include a link to a code repository for the described methodology.
Open Datasets | Yes | We demonstrate our method with LeNet-5 (LeCun et al., 1998) on the MNIST handwritten digits dataset. Our second experiment uses a modern DenseNet (Huang et al., 2017) (k = 12, depth L = 76, with bottlenecks) on CIFAR-10 (Krizhevsky & Hinton, 2009).
Dataset Splits | No | The paper refers to 'validation accuracy' and 'validation error' in its results and training description, e.g. a 'validation accuracy of 99.2%', but it does not give the split percentages or sample counts used for the validation set, nor does it cite a predefined validation split.
Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as CPU or GPU models.
Software Dependencies | No | The paper mentions software such as 'Caffe' and the 'Adam optimizer' but does not specify version numbers for these or other software dependencies.
Experiment Setup | Yes | For LeNet-5: we initialize means θ with the pre-trained weights and variances with log σ² = −8. The warm-up factor β is linearly increased from 0 to 1 during the first 15 epochs. VNQ training runs for a total of 195 epochs with a batch size of 128; the learning rate is linearly decreased from 0.001 to 0, and the learning rate for adjusting the codebook parameter a is 100 times lower. For DenseNet: we use a batch size of 64 samples; the warm-up weight β of the KL term is 0 for the first 5 epochs and is then linearly ramped from 0 to 1 over the next 15 epochs; the learning rate of 0.005 is kept constant for the first 50 epochs and then linearly decreased to 0.003 when training stops after 150 epochs.
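
The pruning step quoted in the Research Type row is described only in prose; the paper provides no code. Below is a minimal PyTorch-style sketch of thresholding the dropout parameter log α_ij against log T_α = 2, assuming the variational posterior is parameterized per weight by a mean θ_ij and a log-variance log σ²_ij. The tensor names and shapes are illustrative assumptions, not taken from the paper.

```python
import torch

# Illustrative posterior parameters for one fully connected layer
# (shapes and names are assumptions, not from the paper):
theta = torch.randn(500, 800)              # posterior means of the weights
log_sigma2 = torch.full_like(theta, -8.0)  # posterior log-variances; the paper initializes log sigma^2 = -8

# Dropout parameter alpha_ij = sigma_ij^2 / theta_ij^2, computed in log-space
# for numerical stability.
log_alpha = log_sigma2 - torch.log(theta.pow(2) + 1e-8)

# Prune (zero out) every weight whose log alpha exceeds the threshold
# log T_alpha = 2 quoted in the table above.
log_T_alpha = 2.0
keep_mask = (log_alpha < log_T_alpha).to(theta.dtype)
pruned_theta = theta * keep_mask

print(f"pruned fraction: {1.0 - keep_mask.mean().item():.2%}")
```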
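
Similarly, the linear KL warm-up and learning-rate schedules quoted in the Experiment Setup row can be written as small helper functions. The sketch below uses the LeNet-5 numbers from the table (15 warm-up epochs, 195 training epochs, learning rate decayed linearly from 0.001 to 0); it is an interpretation of the quoted text, not code released by the authors.

```python
def kl_warmup_beta(epoch: int, warmup_epochs: int = 15) -> float:
    """Linearly ramp the KL weight beta from 0 to 1 over the first warm-up epochs."""
    return min(1.0, epoch / warmup_epochs)


def lenet_learning_rate(epoch: int, total_epochs: int = 195, base_lr: float = 1e-3) -> float:
    """Linearly decay the learning rate from base_lr to 0 over the full training run."""
    return base_lr * max(0.0, 1.0 - epoch / total_epochs)


if __name__ == "__main__":
    # Illustrative values of both schedules at a few epochs.
    for epoch in (0, 10, 15, 100, 195):
        print(epoch, kl_warmup_beta(epoch), round(lenet_learning_rate(epoch), 6))
```

Per the quoted setup, the codebook parameter a would be updated with a learning rate 100 times lower than the value returned above; the DenseNet run would instead hold 0.005 constant for 50 epochs before decaying to 0.003 at epoch 150.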