Approximation Capabilities of Neural ODEs and Invertible Residual Networks

Authors: Han Zhang, Xi Gao, Jacob Unterman, Tom Arodz

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We tested whether an i-ResNet operating in one dimension can learn to perform the x → -x mapping, and whether adding one more dimension has an impact on the ability to learn the mapping. To this end, we constructed a network with five residual blocks. In each block, the residual mapping is a single linear transformation, that is, the residual block is x_{t+1} = x_t + Wx_t. We used the official i-ResNet PyTorch package (Behrmann et al., 2019) that relies on spectral normalization (Miyato et al., 2018) to limit the Lipschitz constant to less than unity. We trained the network on a set of 10,000 randomly generated values of x uniformly distributed in [-10, 10] for 100 epochs, and used an independent test set of 2,000 samples generated similarly. For the one-dimensional x → -x and the two-dimensional [x, 0] → [-x, 0] target mappings, we used MSE as the loss. Adding one extra dimension results in successful learning of the mapping, confirming Theorem 6. The test MSE on each output is below 10^-10; the network learned to negate x, and to bring the additional dimension back to null, allowing for invertibility of the model. For the i-ResNet operating in the original, one-dimensional space, learning is not successful (MSE of 33.39): the network learned a mapping x → cx for a small positive c, that is, the mapping closest to negation of x that can be achieved while keeping non-intersecting paths, experimentally confirming Corollary 4. (A minimal sketch of this setup appears after the table.)
Researcher Affiliation | Academia | Department of Computer Science, Virginia Commonwealth University, Richmond, Virginia, USA. Correspondence to: Tom Arodz <tarodz@vcu.edu>.
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | No | The paper mentions using "the official i-ResNet PyTorch package (Behrmann et al., 2019)" and the "torchdiffeq package (Chen et al., 2018)", but does not provide concrete access to its own source code for the methodology described.
Open Datasets | Yes | We used the CIFAR10 dataset (Krizhevsky, 2009) that consists of 32 x 32 RGB images, that is, each input image has dimensionality of p = 32 x 32 x 3.
Dataset Splits | Yes | We trained the network on a set of 10,000 randomly generated values of x uniformly distributed in [-10, 10] for 100 epochs, and used an independent test set of 2,000 samples generated similarly.
Hardware Specification | Yes | We used the torchdiffeq package (Chen et al., 2018) and trained on a single NVIDIA Tesla V100 GPU.
Software Dependencies | No | The paper mentions using "the official i-ResNet PyTorch package" and the "torchdiffeq package" but does not provide specific version numbers for these or other software dependencies.
Experiment Setup | Yes | In each block, the residual mapping is a single linear transformation, that is, the residual block is x_{t+1} = x_t + Wx_t. We trained the network on a set of 10,000 randomly generated values of x uniformly distributed in [-10, 10] for 100 epochs, and used an independent test set of 2,000 samples generated similarly. We used MSE as the loss. In designing the architecture of the neural network underlying the ODE we followed ANODE (Dupont et al., 2019). Briefly, the network is composed of three 2D convolutional layers. The first two convolutional layers use k filters, and the last one uses the number of input channels as the number of filters, to ensure that the dimensionalities of the input and output of the network match. The convolution stack is followed by a ReLU activation function. A linear layer, with softmax activation and cross-entropy loss, operates on top of the ODE block. (Sketches of both setups are given below.)
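
The negation experiment quoted above can be reproduced roughly as follows. This is a minimal sketch, not the authors' code: the class and function names (LinearIResNet, run), the optimizer, learning rate, and full-batch training are assumptions, and PyTorch's built-in spectral_norm (which drives the spectral norm of W toward 1) stands in for the official i-ResNet normalization that keeps the Lipschitz constant strictly below 1.

```python
import torch
import torch.nn as nn


class LinearIResNet(nn.Module):
    """Stack of linear residual blocks x_{t+1} = x_t + W x_t."""

    def __init__(self, dim: int, n_blocks: int = 5):
        super().__init__()
        # spectral_norm keeps the spectral norm of W near 1; the official
        # i-ResNet package scales it strictly below 1, which this approximates.
        self.blocks = nn.ModuleList(
            [nn.utils.spectral_norm(nn.Linear(dim, dim, bias=False))
             for _ in range(n_blocks)]
        )

    def forward(self, x):
        for block in self.blocks:
            x = x + block(x)  # residual block
        return x


def run(dim: int) -> float:
    # 10,000 training and 2,000 test points, uniform in [-10, 10];
    # for dim == 2 the extra coordinate is zero-padded, target is [-x, 0].
    x_train = torch.empty(10_000, 1).uniform_(-10, 10)
    x_test = torch.empty(2_000, 1).uniform_(-10, 10)
    if dim == 2:
        x_train = torch.cat([x_train, torch.zeros_like(x_train)], dim=1)
        x_test = torch.cat([x_test, torch.zeros_like(x_test)], dim=1)
    y_train, y_test = -x_train, -x_test  # negation target (zeros stay zero)

    model = LinearIResNet(dim)
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)  # optimizer/lr are assumptions
    loss_fn = nn.MSELoss()
    for _ in range(100):  # 100 epochs; full-batch training is an assumption
        opt.zero_grad()
        loss_fn(model(x_train), y_train).backward()
        opt.step()
    return loss_fn(model(x_test), y_test).item()


# The paper reports that the 1-D mapping cannot be learned (it collapses toward
# x -> cx, Corollary 4), while the zero-padded 2-D mapping is learned (Theorem 6).
print("1-D test MSE:", run(1))
print("2-D test MSE:", run(2))
```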
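The ANODE-style ODE classifier described in the experiment-setup row can likewise be sketched with the torchdiffeq package. The filter count k, kernel sizes, padding, integration interval [0, 1], and class names (ConvODEFunc, ODEClassifier) are assumptions not stated in the quoted text; CrossEntropyLoss stands in for the explicit softmax layer, since it applies log-softmax internally.

```python
import torch
import torch.nn as nn
from torchdiffeq import odeint


class ConvODEFunc(nn.Module):
    """Dynamics f(t, x): three 2D conv layers, the last restoring the input
    channel count so input and output dimensionalities match, then ReLU."""

    def __init__(self, in_channels: int = 3, k: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, k, kernel_size=3, padding=1),
            nn.Conv2d(k, k, kernel_size=3, padding=1),
            nn.Conv2d(k, in_channels, kernel_size=3, padding=1),
            nn.ReLU(),
        )

    def forward(self, t, x):
        return self.net(x)


class ODEClassifier(nn.Module):
    """ODE block integrated from t=0 to t=1, with a linear classifier on top."""

    def __init__(self, in_channels: int = 3, side: int = 32,
                 n_classes: int = 10, k: int = 64):
        super().__init__()
        self.odefunc = ConvODEFunc(in_channels, k)
        # p = 32 x 32 x 3 input features for CIFAR10
        self.fc = nn.Linear(in_channels * side * side, n_classes)

    def forward(self, x):
        t = torch.tensor([0.0, 1.0], device=x.device)
        x = odeint(self.odefunc, x, t)[-1]       # state at the final time
        return self.fc(x.flatten(start_dim=1))   # logits for cross-entropy loss


model = ODEClassifier()
images = torch.randn(4, 3, 32, 32)               # a CIFAR10-sized dummy batch
labels = torch.randint(0, 10, (4,))
loss = nn.CrossEntropyLoss()(model(images), labels)
```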