Hierarchical Quantized Autoencoders

Authors: Will Williams, Sam Ringer, Tom Ash, David MacLeod, Jamie Dougherty, John Hughes

NeurIPS 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We provide qualitative and quantitative evaluations on the CelebA and MNIST datasets." and Section 6, Experiments
Researcher Affiliation | Industry | Will Williams (willw@speechmatics.com), Sam Ringer (samr@speechmatics.com), John Hughes (johnh@speechmatics.com), Tom Ash (toma@speechmatics.com), David MacLeod (davidma@speechmatics.com), Jamie Dougherty (jamied@speechmatics.com)
Pseudocode | Yes | Algorithm 1, "Lossy Compression Pseudo-code Using A Quantized Hierarchy" (a hedged compression sketch follows the table)
Open Source Code | Yes | Code available at https://github.com/speechmatics/hqa
Open Datasets | Yes | "We provide qualitative and quantitative evaluations on the CelebA and MNIST datasets." Citations [21] and [18] point to the public datasets, e.g. for CelebA: Z. Liu, P. Luo, X. Wang, and X. Tang. Deep learning face attributes in the wild. In Proceedings of the International Conference on Computer Vision (ICCV), December 2015. URL http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html
Dataset Splits | No | The paper mentions evaluating on "CelebA test examples" and "MNIST 10k test samples" but does not give training/validation/test split percentages or counts beyond the test-set sizes.
Hardware Specification | No | The paper does not specify the hardware (e.g., GPU/CPU models, memory) used to run the experiments.
Software Dependencies | No | The paper mentions various techniques and frameworks (e.g., VAEs, VQ-VAEs, Gumbel-Softmax) but does not list software dependencies with version numbers.
Experiment Setup | Yes | "While training HQA, we linearly decay the Gumbel Softmax temperature to 0 so the soft quantization operation closely resembles hard quantization..." and "We control for the number of parameters (∼1M) in each system, training each with codebook size 256 and dimension 64." (a temperature-decay sketch follows the table)
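
The Pseudocode row above names Algorithm 1, "Lossy Compression Pseudo-code Using A Quantized Hierarchy". The paper's algorithm is not reproduced here; the following is a minimal sketch of the general idea under stated assumptions: a stack of quantized autoencoder layers, each encoding the quantized embeddings of the layer below, where compressing to depth d means keeping only layer d's discrete code indices. The names (QuantizedLayer, compress, decompress) and all layer shapes are illustrative, not the released speechmatics/hqa implementation.

```python
# Hypothetical sketch only: not the authors' Algorithm 1 or the speechmatics/hqa API.
import torch
import torch.nn as nn


class QuantizedLayer(nn.Module):
    """One layer of the hierarchy: encode -> nearest-code quantize -> decode."""

    def __init__(self, in_channels: int, codebook_size: int = 256, code_dim: int = 64):
        super().__init__()
        self.encoder = nn.Conv2d(in_channels, code_dim, kernel_size=4, stride=2, padding=1)
        self.decoder = nn.ConvTranspose2d(code_dim, in_channels, kernel_size=4, stride=2, padding=1)
        self.codebook = nn.Embedding(codebook_size, code_dim)

    def encode(self, h: torch.Tensor) -> torch.Tensor:
        """Map the layer input to discrete code indices of shape (B, H', W')."""
        z = self.encoder(h)                                   # (B, D, H', W')
        flat = z.permute(0, 2, 3, 1).reshape(-1, z.shape[1])  # (B*H'*W', D)
        indices = torch.cdist(flat, self.codebook.weight).argmin(dim=1)
        return indices.view(z.shape[0], z.shape[2], z.shape[3])

    def embed(self, indices: torch.Tensor) -> torch.Tensor:
        """Look up codebook vectors for the given indices, shape (B, D, H', W')."""
        return self.codebook(indices).permute(0, 3, 1, 2)

    def decode(self, z_q: torch.Tensor) -> torch.Tensor:
        """Reconstruct this layer's input from quantized embeddings."""
        return self.decoder(z_q)


def compress(layers: list, x: torch.Tensor, depth: int) -> torch.Tensor:
    """Encode up through `depth` layers; only the top layer's indices are stored."""
    h, indices = x, None
    for layer in layers[:depth]:
        indices = layer.encode(h)
        h = layer.embed(indices)
    return indices


def decompress(layers: list, indices: torch.Tensor, depth: int) -> torch.Tensor:
    """Decode stored indices back down through the hierarchy to image space."""
    h = layers[depth - 1].embed(indices)
    for layer in reversed(layers[:depth]):
        h = layer.decode(h)
    return h


# Example: a 2-layer hierarchy over 3x32x32 inputs with the 256 x 64 codebooks quoted above.
layers = [QuantizedLayer(in_channels=3), QuantizedLayer(in_channels=64)]
x = torch.randn(4, 3, 32, 32)
codes = compress(layers, x, depth=2)        # (4, 8, 8) integer indices
recon = decompress(layers, codes, depth=2)  # (4, 3, 32, 32) lossy reconstruction
```

Deeper layers keep fewer indices, so the trade-off between bitrate and fidelity is controlled by the chosen depth; this is only a schematic reading of the quoted algorithm title.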
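The Experiment Setup row quotes a Gumbel-Softmax temperature that is decayed linearly towards 0 so that soft quantization approaches hard quantization, with codebooks of 256 entries and dimension 64. Below is a minimal sketch of that schedule; the distance-based logits, the temperature endpoints, and the step count are assumptions rather than the paper's exact values.

```python
# Hypothetical sketch of the quoted schedule: Gumbel-Softmax code assignment whose
# temperature is linearly decayed towards 0. Logit form, endpoints, and step count
# are assumptions, not values from the paper.
import torch
import torch.nn.functional as F


def gumbel_softmax_quantize(z: torch.Tensor, codebook: torch.Tensor, tau: float) -> torch.Tensor:
    """Softly assign encoder outputs z (N, D) to codebook entries (K, D) at temperature tau."""
    logits = -torch.cdist(z, codebook) ** 2                      # closer codes get higher logits
    soft_onehot = F.gumbel_softmax(logits, tau=tau, hard=False)  # (N, K), differentiable
    return soft_onehot @ codebook                                # (N, D) soft-quantized vectors


def linear_temperature(step: int, total_steps: int, start: float = 1.0, end: float = 1e-3) -> float:
    """Linear decay from `start` towards ~0 over training (endpoints are assumed)."""
    frac = min(step / total_steps, 1.0)
    return start + frac * (end - start)


# Codebook size 256 and dimension 64, as quoted in the experiment-setup row.
codebook = torch.randn(256, 64)
z = torch.randn(1024, 64)  # stand-in for flattened encoder outputs

for step in (0, 5_000, 10_000):
    tau = linear_temperature(step, total_steps=10_000)
    z_q = gumbel_softmax_quantize(z, codebook, tau)
    print(f"step {step}: tau={tau:.4f}, soft-quantized shape {tuple(z_q.shape)}")
```

As the temperature falls, the soft one-hot assignment concentrates on the nearest codebook entry, which is the sense in which soft quantization "closely resembles hard quantization" in the quoted setup.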