Invertible DenseNets with Concatenated LipSwish

Authors: Yura Perugachi-Diaz, Jakub Tomczak, Sandjai Bhulai

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To make a clear comparison between the performance of Residual Flows and i-DenseNets, we train both models on 2-dimensional toy data and high-dimensional image data: CIFAR10 [22] and ImageNet32 [5]. The results of the learned density distributions are presented in Figure 3. We observe that Residual Flows are capable of capturing high-probability areas. We measure performance in bits per dimension (bpd); the results can be found in Table 3. (A nats-to-bpd conversion sketch is given after this table.)
Researcher Affiliation | Academia | Yura Perugachi-Diaz, Vrije Universiteit Amsterdam (y.m.perugachidiaz@vu.nl); Jakub M. Tomczak, Vrije Universiteit Amsterdam (j.m.tomczak@vu.nl); Sandjai Bhulai, Vrije Universiteit Amsterdam (s.bhulai@vu.nl)
Pseudocode | No | No pseudocode or clearly labeled algorithm block was found in the paper.
Open Source Code | Yes | The code can be retrieved from: https://github.com/yperugachidiaz/invertible_densenets
Open Datasets | Yes | To make a clear comparison between the performance of Residual Flows and i-DenseNets, we train both models on 2-dimensional toy data and high-dimensional image data: CIFAR10 [22] and ImageNet32 [5].
Dataset Splits | No | The paper does not explicitly provide training/validation/test splits (specific percentages, sample counts, or a splitting methodology) beyond mentioning a test set.
Hardware Specification | No | To speed up training, we use 4 GPUs.
Software Dependencies | No | No specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions) were mentioned in the paper's text.
Experiment Setup | Yes | We train both models for 50,000 iterations and, at the end of the training, we visualize the learned distributions. For density estimation, we run the full model with the best settings for 1,000 epochs on CIFAR10 and 20 epochs on ImageNet32, where we use single-seed results following [2, 4, 20], due to little fluctuations in performance. We run the three models for 400 epochs and note that the model with λ = 1 was not fully converged in both accuracy and bits per dimension after training.
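
The quoted setup reports density estimation results in bits per dimension (bpd). As a point of reference only, the minimal sketch below shows the standard conversion from an average negative log-likelihood in nats to bpd for 3x32x32 images (CIFAR10 / ImageNet32). The function name and the example NLL value are illustrative assumptions and are not taken from the authors' repository; the sketch also omits any dequantization offset, which depends on how pixel values are scaled.

```python
import math

def bits_per_dimension(nll_nats: float, num_dims: int) -> float:
    """Convert an average negative log-likelihood in nats per example
    into bits per dimension: bpd = NLL / (num_dims * ln 2).

    Illustrative helper, not from the authors' code; it omits any
    dequantization constant, which depends on the pixel scaling used.
    """
    return nll_nats / (num_dims * math.log(2))

# CIFAR10 / ImageNet32 images have 3 * 32 * 32 = 3072 dimensions.
# A hypothetical average NLL of 7000 nats corresponds to roughly 3.29 bpd.
print(bits_per_dimension(7000.0, 3 * 32 * 32))
```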