Invertible Residual Networks

Authors: Jens Behrmann, Will Grathwohl, Ricky T. Q. Chen, David Duvenaud, Joern-Henrik Jacobsen

ICML 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our empirical evaluation shows that invertible ResNets perform competitively with both state-of-the-art image classifiers and flow-based generative models, something that has not been previously achieved with a single architecture.
Researcher Affiliation | Academia | University of Bremen, Center for Industrial Mathematics; Vector Institute and University of Toronto.
Pseudocode | Yes | Algorithm 1. Inverse of an i-ResNet layer via fixed-point iteration. ... Algorithm 2. Forward pass of an invertible ResNet with Lipschitz constraint and log-determinant approximation; SN denotes spectral normalization based on (2). (A minimal sketch of the fixed-point inversion is given below the table.)
Open Source Code | Yes | Official code release: https://github.com/jhjacobsen/invertible-resnet
Open Datasets | Yes | To compare the discriminative performance and invertibility of i-ResNets with standard ResNet architectures, we train both models on CIFAR10, CIFAR100, and MNIST.
Dataset Splits | No | The paper mentions using the CIFAR10, CIFAR100, and MNIST datasets but does not provide training/validation/test split percentages or sample counts, nor does it cite a standard split configuration for reproduction.
Hardware Specification | Yes | The runtime on 4 GeForce GTX 1080 GPUs with 1 spectral norm iteration was 0.5 sec for a forward and backward pass of a batch with 128 samples, while it took 0.2 sec without spectral normalization.
Software Dependencies | No | The paper mentions software components and methods like 'SGD with momentum', 'Adam or Adamax', 'ELU', 'softplus', and 'ReLU', but does not provide specific version numbers for any software dependencies.
Experiment Setup | Yes | The CIFAR and MNIST models have 54 and 21 residual blocks, respectively, and we use identical settings for all other hyperparameters. ... We are able to train i-ResNets using SGD with momentum and a learning rate of 0.1, whereas all versions of Glow we tested needed Adam or Adamax (Kingma & Ba, 2014) and much smaller learning rates to avoid divergence. ... To obtain the numerical inverse, we apply 100 fixed-point iterations (Equation (1)) for each block. ... Compared to the classification model, the log-determinant approximation with 5 series terms roughly increased the computation times by a factor of 4. (A sketch of this log-determinant estimator also appears below the table.)
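
Algorithm 1 referenced in the Pseudocode row inverts a residual block y = x + g(x) by fixed-point iteration. Below is a minimal PyTorch sketch of that idea, not the official implementation: `g` is assumed to be the residual branch (made contractive, e.g. by spectral normalization), and the function name and `n_iter` argument are placeholders.

```python
import torch

@torch.no_grad()  # the inversion itself needs no gradient tracking
def invert_residual_block(g, y, n_iter=100):
    """Approximate x such that y = x + g(x).

    The iteration x_{k+1} = y - g(x_k) converges whenever g is a
    contraction (Lipschitz constant < 1), which i-ResNets enforce
    through spectral normalization of the residual branch.
    """
    x = y.clone()              # start the iterate at the block output
    for _ in range(n_iter):
        x = y - g(x)           # fixed-point update
    return x
```

Because g is contractive, the iteration converges linearly at a rate set by its Lipschitz constant, which is why a fixed, generous iteration budget (100 per block in the paper's experiments) can be used instead of a tuned stopping criterion.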
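
The log-determinant approximation mentioned in the Experiment Setup row truncates the power series ln det(I + J_g) = sum_{k>=1} (-1)^(k+1) tr(J_g^k) / k and estimates each trace with a Hutchinson probe vector. The following is an illustrative, unofficial sketch assuming a batched input tensor and a PyTorch residual branch `g`; `log_det_estimate` and its arguments are hypothetical names, with `n_terms=5` matching the setting reported in the paper.

```python
import torch

def log_det_estimate(g, x, n_terms=5):
    """Truncated power-series estimate of ln det(I + J_g(x)) per sample.

    Each trace tr(J_g^k) is approximated by a single Hutchinson sample
    v^T J_g^k v, computed with repeated vector-Jacobian products.
    Assumes x has a leading batch dimension.
    """
    if not x.requires_grad:
        x = x.detach().requires_grad_(True)
    y = g(x)
    v = torch.randn_like(x)                      # Hutchinson probe
    w = v
    log_det = torch.zeros(x.shape[0], device=x.device)
    for k in range(1, n_terms + 1):
        # w <- J_g(x)^T w, so that (w * v) sums to v^T J_g^k v
        (w,) = torch.autograd.grad(
            y, x, grad_outputs=w, retain_graph=True, create_graph=True
        )
        log_det = log_det + (-1) ** (k + 1) * (w * v).flatten(1).sum(1) / k
    return log_det
```

The create_graph=True flag keeps the estimate differentiable so it can be used inside a maximum-likelihood training objective; for evaluation only, it could be dropped to save memory and time.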