Analysis of Quantized Models
Authors: Lu Hou, Ruiliang Zhang, James T. Kwok
ICLR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical experiments confirm the theoretical convergence results, and demonstrate that quantized networks can speed up training and have comparable performance as full-precision networks. |
| Researcher Affiliation | Collaboration | Lu Hou¹, Ruiliang Zhang¹·², James T. Kwok¹. ¹Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Hong Kong ({lhouab,jamesk}@cse.ust.hk); ²TuSimple (ruiliang.zhang@tusimple.ai) |
| Pseudocode | No | The paper contains mathematical derivations and descriptions of methods, but no formally labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any statement or link indicating that source code is open or publicly available for the described methodology. |
| Open Datasets | Yes | In this experiment, we follow (Wen et al., 2017) and use the same train/test split, data preprocessing, augmentation and distributed Tensorflow setup. ... We train the AlexNet on ImageNet. |
| Dataset Splits | Yes | In this experiment, we follow (Wen et al., 2017) and use the same train/test split, data preprocessing, augmentation and distributed Tensorflow setup. |
| Hardware Specification | Yes | Speedup of ImageNet training on a 16-node GPU cluster. Each node has 4 1080Ti GPUs with one PCI switch. |
| Software Dependencies | No | The paper mentions 'distributed Tensorflow setup' and 'Adam is used as the optimizer' but does not specify version numbers for these software components. |
| Experiment Setup | Yes (see sketch below) | The optimizer is RMSProp, and the learning rate is η_t = η/t, where η = 0.03. Training is terminated when the average training loss does not decrease for 5000 iterations. ... The learning rate is decayed from 0.0002 by a factor of 0.1 every 200 epochs. |
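The Experiment Setup row quotes two learning-rate rules and a plateau-based stopping criterion. The Python sketch below is illustrative only: the authors released no code, the quoted fragments do not say which schedule pairs with which experiment, and every name here (`inverse_time_lr`, `step_decay_lr`, `LossPlateauStopper`) is hypothetical rather than taken from the paper.

```python
# Minimal sketch of the schedules and stopping rule quoted above.
# All identifiers are hypothetical; this is not the authors' code.

def inverse_time_lr(t, eta=0.03):
    """eta_t = eta / t, as quoted for the RMSProp runs (iteration t >= 1)."""
    return eta / t

def step_decay_lr(epoch, base_lr=2e-4, factor=0.1, every=200):
    """Decay from 0.0002 by a factor of 0.1 every 200 epochs, as quoted."""
    return base_lr * (factor ** (epoch // every))

class LossPlateauStopper:
    """Stop when the average training loss has not decreased for `patience` iterations."""

    def __init__(self, patience=5000):
        self.patience = patience
        self.best = float("inf")
        self.stale = 0

    def should_stop(self, avg_train_loss):
        if avg_train_loss < self.best:
            self.best = avg_train_loss
            self.stale = 0
        else:
            self.stale += 1
        return self.stale >= self.patience
```

A training loop would query the appropriate schedule before each parameter update and break out once `should_stop` returns True for the running average of the training loss; how the two schedules map onto the individual experiments is an assumption left to the reader of the original paper.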