Wavelet Flow: Fast Training of High Resolution Normalizing Flows
Authors: Jason J. Yu, Konstantinos G. Derpanis, Marcus A. Brubaker
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 3: Experimental Evaluation |
| Researcher Affiliation | Collaboration | Jason J. Yu (1,3), Konstantinos G. Derpanis (2,4,5), and Marcus A. Brubaker (1,3,5). 1: Department of Electrical Engineering and Computer Science, York University, Toronto; 2: Department of Computer Science, Ryerson University, Toronto; 3: Borealis AI; 4: Samsung AI Centre Toronto; 5: Vector Institute |
| Pseudocode | No | The paper describes methods and processes verbally and mathematically (e.g., Equations 1 and 2, and descriptions of the Wavelet Flow architecture and sampling), but does not include any explicitly labeled 'Algorithm' or 'Pseudocode' blocks. (A hedged sketch of one Haar wavelet decomposition level, the building block of the architecture, appears after this table.) |
| Open Source Code | Yes | Code for Wavelet Flow is available at the following project page: https://yorkucvil.github.io/Wavelet-Flow. |
| Open Datasets | Yes | To evaluate the performance of Wavelet Flow, we use several standard image datasets to directly compare against the reported results of previous methods. Specifically, we train and evaluate our model on natural image datasets at the commonly used resolutions and follow standard preprocessing: ImageNet [38] (32×32 and 64×64) and Large-scale Scene Understanding (LSUN) bedroom, tower, and church outdoor [46] (64×64). We also train on two high resolution datasets at resolutions not previously reported: CelebFaces Attributes High-Quality (CelebA-HQ) [21] (1024×1024) and Flickr-Faces-HQ (FFHQ) [22] (1024×1024). |
| Dataset Splits | Yes | In cases where overfitting is observed, early stopping is applied based on a held-out validation set. As no standard dataset split is available, we generate our own with 59,000, 4,000, and 7,000 images for training, validation, and testing, respectively. (A hedged sketch of reproducing such a split appears after the table.) |
| Hardware Specification | Yes | These implementation choices allow for distributions to be trained using a batch-size of 64 without gradient checkpointing on a single NVIDIA TITAN X (Pascal) GPU. |
| Software Dependencies | No | The paper mentions using 'Adamax optimizer [23]' and 'Glow architecture [24]' as components. However, it does not specify software dependencies with version numbers for core libraries or environments (e.g., Python version, specific deep learning frameworks like PyTorch or TensorFlow and their versions, or CUDA versions). |
| Experiment Setup | Yes | Training is done using the same Adamax optimizer [23] as in [24]. These implementation choices allow for distributions to be trained using a batch size of 64 without gradient checkpointing on a single NVIDIA TITAN X (Pascal) GPU. Hyper-parameters are set to produce models with parameter counts similar to, but not exceeding, those of Glow [24] to enable a fair comparison. (A hedged optimizer-setup sketch appears after the table.) |
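Since the paper contains no pseudocode blocks, the following minimal NumPy sketch shows one level of the Haar wavelet decomposition that Wavelet Flow builds its multi-scale pyramid from. The normalization convention, channel layout, and band naming here are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def haar_level(x):
    """One level of a 2D Haar decomposition (normalization is an assumption).

    x: array of shape (H, W, C) with even H and W.
    Returns a half-resolution low-pass image and three detail bands.
    """
    a = x[0::2, 0::2]  # top-left pixel of every 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right

    low      = (a + b + c + d) / 4.0  # low-pass (average) image at half resolution
    detail_h = (a + b - c - d) / 4.0  # horizontal detail
    detail_v = (a - b + c - d) / 4.0  # vertical detail
    detail_d = (a - b - c + d) / 4.0  # diagonal detail
    return low, (detail_h, detail_v, detail_d)

# Applying haar_level recursively to `low` yields the multi-scale pyramid;
# Wavelet Flow models the detail bands at each level with a conditional flow.
```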
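For the dataset-split row, a hypothetical reconstruction of the reported 59,000/4,000/7,000 split is sketched below (the counts sum to 70,000, the size of FFHQ). The seed and image ordering are assumptions; the authors' actual assignment is not given in this report.

```python
import numpy as np

# Hypothetical split: 59,000 / 4,000 / 7,000 images for train / val / test.
rng = np.random.default_rng(seed=0)   # seed chosen arbitrarily here
indices = rng.permutation(70_000)     # e.g. indices into the 70,000 FFHQ images

train_idx = indices[:59_000]
val_idx = indices[59_000:63_000]
test_idx = indices[63_000:]

assert len(train_idx) == 59_000
assert len(val_idx) == 4_000
assert len(test_idx) == 7_000
```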
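For the experiment-setup row, a hedged sketch of the stated optimizer configuration: the paper reports Adamax and a batch size of 64 on a single TITAN X, but the framework choice (PyTorch here), learning rate, model stand-in, and loss below are assumptions rather than the authors' code.

```python
import torch

# Placeholder stand-in for one pyramid level; the real model is a conditional
# Glow-style flow as described in the paper.
model = torch.nn.Conv2d(3, 3, kernel_size=3, padding=1)

# Adamax is the optimizer named in the paper; the learning rate is an assumption.
optimizer = torch.optim.Adamax(model.parameters(), lr=1e-3)

batch_size = 64  # reported batch size on a single NVIDIA TITAN X (Pascal)

def train_step(batch):
    """One optimization step; the loss is a placeholder for the flow's negative log-likelihood."""
    optimizer.zero_grad()
    loss = model(batch).pow(2).mean()
    loss.backward()
    optimizer.step()
    return loss.item()
```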