BatchQuant: Quantized-for-all Architecture Search with Robust Quantizer
Authors: Haoping Bai, Meng Cao, Ping Huang, Jiulong Shan
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the effectiveness of our method on ImageNet and achieve SOTA Top-1 accuracy under a low complexity constraint (< 20 MFLOPs). (Section 5, Experimental Analysis and Results) |
| Researcher Affiliation | Industry | Haoping Bai, Meng Cao, Ping Huang, Jiulong Shan ({haoping_bai, mengcao, huang_ping, jiulong_shan}@apple.com) |
| Pseudocode | No | The paper describes methods and formulas but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code and models will be made publicly available at https://github.com/bhpfelix/QFA. |
| Open Datasets | Yes | We demonstrate the effectiveness of our method on ImageNet and achieve SOTA Top-1 accuracy under a low complexity constraint (< 20 MFLOPs). ImageNet: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition, pages 248–255, 2009. |
| Dataset Splits | Yes | After training is complete, we randomly sample 16k quantized subnets and evaluate on 10k validation images sampled from the training set to train the accuracy predictor. (A sketch of this collection step follows the table.) |
| Hardware Specification | Yes | We train with a batch size of 2048 across 32 V100 GPUs on our internal cluster. |
| Software Dependencies | No | The paper mentions basing its codebase on an open-source implementation but does not provide specific version numbers for software dependencies like PyTorch, TensorFlow, or CUDA. |
| Experiment Setup | Yes | For both stages of the elastic quantization procedure, we follow the common hyperparameter choice of [22] and use an initial learning rate of 0.08. For all experiments, we clip the global norm of the gradient at 500. We train with a batch size of 2048 across 32 V100 GPUs on our internal cluster. ... During the evolutionary search, we keep a population size of 500 for 1000 generations. For each generation, once we identify the Pareto population based on nondominated sorting and crowding distance, we breed new genotypes through crossover and mutation with a crossover probability of 0.007 and a mutation probability of 0.02. (Sketches of the training step and the search loop follow the table.) |
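
The quoted training setup maps onto a standard gradient-clipped SGD step. Below is a minimal PyTorch sketch: the 0.08 learning rate and the global-norm clip at 500 are quoted from the paper, while the choice of SGD, the momentum value, and the toy model and data are assumptions for illustration only.

```python
import torch
from torch import nn

INIT_LR = 0.08          # initial learning rate (quoted above)
GRAD_CLIP_NORM = 500.0  # global gradient-norm clip (quoted above)

def train_step(model: nn.Module, optimizer: torch.optim.Optimizer,
               images: torch.Tensor, labels: torch.Tensor) -> float:
    """One supervised training step with the quoted LR and clipping."""
    optimizer.zero_grad()
    loss = nn.functional.cross_entropy(model(images), labels)
    loss.backward()
    # Clip the global norm of the gradient at 500, as the paper states.
    torch.nn.utils.clip_grad_norm_(model.parameters(), GRAD_CLIP_NORM)
    optimizer.step()
    return loss.item()

# Toy usage with a dummy model; the actual runs train a quantized supernet
# with batch size 2048 across 32 V100 GPUs.
model = nn.Linear(8, 10)
opt = torch.optim.SGD(model.parameters(), lr=INIT_LR, momentum=0.9)  # momentum assumed
x, y = torch.randn(4, 8), torch.randint(0, 10, (4,))
print(train_step(model, opt, x, y))
```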
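
The Dataset Splits row describes how the accuracy predictor's training data is collected: sample quantized subnets and measure each on a 10k holdout drawn from the ImageNet training images. A minimal sketch, assuming placeholder `sample_quantized_subnet` and `measure_accuracy` helpers, since the paper's genotype encoding and evaluation pipeline are not specified here:

```python
import random

N_SUBNETS, N_HOLDOUT = 16_000, 10_000  # counts quoted in the Dataset Splits row

def sample_quantized_subnet():
    """Placeholder: a random architecture/bit-width genotype."""
    return tuple(random.randint(0, 3) for _ in range(20))

def measure_accuracy(genotype):
    """Placeholder for evaluating the subnet on the 10k held-out
    training images; returns a dummy score here."""
    return random.random()

# (genotype, accuracy) pairs used to fit the accuracy predictor.
predictor_data = [(g, measure_accuracy(g))
                  for g in (sample_quantized_subnet() for _ in range(N_SUBNETS))]
print(len(predictor_data))
```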
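
The evolutionary search in the Experiment Setup row is an NSGA-II-style loop: identify a Pareto population via nondominated sorting and crowding distance, then breed replacements by crossover and mutation. The sketch below keeps the quoted population size, generation count, and probabilities; the genotype encoding, the objective functions, and the per-gene reading of the crossover/mutation probabilities are assumptions, and crowding distance is omitted for brevity.

```python
import random

# Quoted settings from the table; everything else below is a placeholder.
POP_SIZE, GENERATIONS = 500, 1000
P_CROSS, P_MUT = 0.007, 0.02
GENE_CHOICES, N_GENES = [0, 1, 2, 3], 20  # e.g. per-layer op/bit-width ids

def evaluate(g):
    """Placeholder objectives: (accuracy to maximize, FLOPs to minimize)."""
    return (sum(g) + random.random(), sum(x * x for x in g) + 1.0)

def dominates(a, b):
    """True if score a is Pareto-better than score b."""
    return a[0] >= b[0] and a[1] <= b[1] and (a[0] > b[0] or a[1] < b[1])

def pareto_front(pop, scores):
    """First nondominated front; the paper additionally uses crowding distance."""
    return [pop[i] for i in range(len(pop))
            if not any(dominates(scores[j], scores[i])
                       for j in range(len(pop)) if j != i)]

def breed(a, b):
    """Per-gene crossover then mutation (interpretation assumed)."""
    child = [y if random.random() < P_CROSS else x for x, y in zip(a, b)]
    return [random.choice(GENE_CHOICES) if random.random() < P_MUT else x
            for x in child]

def search(pop_size=POP_SIZE, generations=GENERATIONS):
    pop = [[random.choice(GENE_CHOICES) for _ in range(N_GENES)]
           for _ in range(pop_size)]
    for _ in range(generations):
        scores = [evaluate(g) for g in pop]
        parents = pareto_front(pop, scores)
        pop = [breed(random.choice(parents), random.choice(parents))
               for _ in range(pop_size)]
    return pareto_front(pop, [evaluate(g) for g in pop])

# Quick toy run; the paper's full settings correspond to search() defaults,
# which are slow in pure Python.
print(len(search(pop_size=50, generations=20)))
```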