ProxQuant: Quantized Neural Networks via Proximal Operators

Authors: Yu Bai, Yu-Xiang Wang, Edo Liberty

ICLR 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate the effectiveness and flexibility of PROXQUANT through systematic experiments on (1) image classification with ResNets (Section 4.1); (2) language modeling with LSTMs (Section 4.2). The PROXQUANT method outperforms the state-of-the-art results on binary quantization and is comparable with the state-of-the-art on multi-bit quantization. We perform image classification on the CIFAR-10 dataset, which contains 50000 training images and 10000 test images of size 32x32.
Researcher Affiliation | Collaboration | Yu Bai (Stanford University, yub@stanford.edu); Yu-Xiang Wang (UC Santa Barbara, yuxiangw@cs.ucsb.edu); Edo Liberty (Amazon AI, libertye@amazon.com)
Pseudocode | Yes | Algorithm 1 PROXQUANT: Prox-gradient method for quantized net training
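
To make the pseudocode row concrete: Algorithm 1 alternates a stochastic gradient step with a proximal step. The sketch below (Python/NumPy) illustrates the binary-quantization case, where the prox of the W-shaped regularizer lam * min(|w - 1|, |w + 1|) soft-thresholds each weight toward its nearest value in {-1, +1}. The helper names prox_binary and prox_gradient_step are illustrative, not taken from the authors' code.

import numpy as np

def prox_binary(theta, lam):
    # Prox of lam * min(|w - 1|, |w + 1|): soft-threshold each weight
    # toward its nearest binary value in {-1, +1}.
    sign = np.sign(theta)
    sign = np.where(sign == 0, 1.0, sign)        # break ties toward +1
    residual = theta - sign                      # offset from the nearest of {-1, +1}
    shrunk = np.sign(residual) * np.maximum(np.abs(residual) - lam, 0.0)
    return sign + shrunk

def prox_gradient_step(theta, grad, lr, lam):
    # One ProxQuant-style update: gradient step, then the proximal operator.
    return prox_binary(theta - lr * grad, lam)

# Example: weights drift toward {-1, +1} as lam grows.
theta = np.array([0.3, -0.7, 1.2])
print(prox_gradient_step(theta, grad=np.zeros(3), lr=0.01, lam=0.1))
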
Open Source Code | Yes | Code available at https://github.com/allenbai01/ProxQuant.
Open Datasets | Yes | We perform image classification on the CIFAR-10 dataset, which contains 50000 training images and 10000 test images of size 32x32. We perform language modeling with LSTMs (Hochreiter & Schmidhuber, 1997) on the Penn Treebank (PTB) dataset (Marcus et al., 1993), which contains 929K training tokens, 73K validation tokens, and 82K test tokens.
Dataset Splits | Yes | We perform image classification on the CIFAR-10 dataset, which contains 50000 training images and 10000 test images of size 32x32. We perform language modeling with LSTMs (Hochreiter & Schmidhuber, 1997) on the Penn Treebank (PTB) dataset (Marcus et al., 1993), which contains 929K training tokens, 73K validation tokens, and 82K test tokens.
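
Both datasets are standard benchmarks. For reference, the CIFAR-10 splits quoted above can be obtained with common loaders; the snippet below uses torchvision, which is an assumption about tooling, not something the paper specifies. The PTB splits (929K/73K/82K tokens) are usually taken from the commonly used Mikolov-preprocessed plain-text files.

from torchvision import datasets, transforms

# Downloads the standard 50,000-image training split and 10,000-image test split.
train_set = datasets.CIFAR10(root="./data", train=True, download=True,
                             transform=transforms.ToTensor())
test_set = datasets.CIFAR10(root="./data", train=False, download=True,
                            transform=transforms.ToTensor())
print(len(train_set), len(test_set))  # 50000 10000
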
Hardware Specification | No | The paper does not explicitly state the hardware specifications (e.g., GPU models, CPU types, memory) used for running the experiments.
Software Dependencies | No | The paper mentions software components such as the Adam optimizer, but does not provide specific version numbers for any software dependencies.
Experiment Setup | Yes | We use the homotopy method λ_t = λ·t with λ = 10^-4 as the regularization strength and Adam with a constant learning rate of 0.01 as the optimizer. For BinaryConnect, we train with the recommended Adam optimizer with learning rate decay (Courbariaux et al., 2015) (initial learning rate 0.01, multiplied by 0.1 at epochs 81 and 122).
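
A rough PyTorch-style sketch of that setup is given below, with a toy linear model and random data standing in for the ResNet/CIFAR-10 pipeline. The per-iteration homotopy schedule and the choice of which parameters get the prox are assumptions for illustration, not the authors' exact code.

import torch
import torch.nn as nn

def prox_binary_(w, lam):
    # In-place prox of lam * min(|w - 1|, |w + 1|): soft-threshold toward {-1, +1}.
    sign = torch.sign(w)
    sign[sign == 0] = 1.0
    residual = w - sign
    w.copy_(sign + torch.sign(residual) * (residual.abs() - lam).clamp(min=0.0))

# Toy stand-ins for the real ResNet / CIFAR-10 training pipeline.
model = nn.Linear(32, 10)
data = [(torch.randn(8, 32), torch.randint(0, 10, (8,))) for _ in range(5)]
loss_fn = nn.CrossEntropyLoss()

optimizer = torch.optim.Adam(model.parameters(), lr=0.01)   # constant learning rate
base_lambda = 1e-4                                           # homotopy strength lambda

t = 0
for epoch in range(3):
    for x, y in data:
        optimizer.zero_grad()
        loss_fn(model(x), y).backward()
        optimizer.step()
        t += 1
        lam_t = base_lambda * t           # lambda_t = lambda * t grows linearly
        with torch.no_grad():
            for w in model.parameters():  # in practice, only the weights being quantized
                prox_binary_(w, lam_t)
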