Optimizing Neural Networks via Koopman Operator Theory

Authors: Akshunna S. Dogra, William Redman

NeurIPS 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility assessment. Each entry below lists the reproducibility variable, the result, and the supporting LLM response.
Research Type: Experimental. We show that Koopman operator theoretic methods allow for accurate predictions of weights and biases of feedforward, fully connected deep networks over a non-trivial range of training time. During this window, we find that our approach is >10x faster than various gradient descent based methods (e.g. Adam, Adadelta, Adagrad), in line with our complexity analysis. We then present the results of Koopman training two different feedforward, fully connected, deep NNs: an NN differential equation (DE) solver and a classifier trained on the MNIST data set.
Researcher Affiliation: Academia. Akshunna S. Dogra, John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA 02138, asdpsn@gmail.com; William T. Redman, Interdepartmental Graduate Program in Dynamical Neuroscience, University of California, Santa Barbara, Santa Barbara, CA 93106, wredman@ucsb.edu.
Pseudocode: Yes. Pseudo-code for how weight/bias data from standard training iterations t1 to t2 is used to approximate the Node Koopman operators, and how these operators are then used to predict the weight evolution from t2 + 1 to t2 + T, is given in Algorithm 1 ("Koopman training via Node Koopman operators"). An illustrative sketch of this procedure is given after the last entry below.
Open Source Code: No. The paper does not provide explicit links or statements about the availability of its source code.
Open Datasets: Yes. We then present the results of Koopman training two different feedforward, fully connected, deep NNs: an NN differential equation (DE) solver and a classifier trained on the MNIST data set. This NN was trained on the MNIST data set via stochastic Adadelta (Fig. 3a; see Sec. S4 for more details).
Dataset Splits: No. The paper distinguishes between training and validation loss, but it does not specify exact split percentages or sample counts for the validation set.
Hardware Specification: No. The paper mentions 'leveraging the immense parallelization capacities of modern GPUs' but does not specify particular GPU models, CPU models, or any other hardware configuration used for the experiments.
Software Dependencies: No. The paper mentions using 'the PyTorch environment [37]: Adam [38], Adagrad [39], and Adadelta [40]' but does not provide specific version numbers for PyTorch or the optimizers. Reference [37] cites the NeurIPS 2019 PyTorch paper, which implies a release from that year, but a numerical version (e.g., PyTorch 1.9) is not given.
Experiment Setup: Yes. For ease of implementation, we modified it to have sigmoidal activation functions, a fixed batch (making the optimization non-stochastic), a learning rate with a weak training-step-dependent decay, and trained with various optimizers via backpropagation (see Sec. S3 for more details). This NN was trained on the MNIST data set via stochastic Adadelta (Fig. 3a; see Sec. S4 for more details). An illustrative PyTorch sketch of this kind of setup follows the Koopman sketch below.
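
As a companion to the Pseudocode entry, the following is a minimal NumPy sketch of the kind of least-squares (DMD-style) step that Algorithm 1 describes: snapshots of a node's weights/biases over iterations t1 to t2 are used to fit a linear operator, which is then iterated to predict the evolution from t2 + 1 to t2 + T. The function names, the snapshot layout, and the plain pseudoinverse solve are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def fit_node_koopman_operator(snapshots):
    """DMD-style least-squares fit of a linear operator K such that
    snapshots[:, t + 1] ~= K @ snapshots[:, t].

    snapshots: (d, n) array whose columns are one node's weights/biases
    recorded at consecutive training iterations t1, ..., t2.
    """
    X, Y = snapshots[:, :-1], snapshots[:, 1:]
    return Y @ np.linalg.pinv(X)  # K = Y X^+

def koopman_predict(K, w_t2, T):
    """Iterate the fitted operator to predict the T states after t2."""
    preds, w = [], w_t2
    for _ in range(T):
        w = K @ w
        preds.append(w)
    return np.stack(preds, axis=1)  # shape (d, T)

# Usage: fit on the recorded window, then skip ahead without backpropagation.
# K = fit_node_koopman_operator(snapshots)
# future = koopman_predict(K, snapshots[:, -1], T=100)
```

The prediction loop involves only matrix-vector products, which is the source of the speedup claimed relative to gradient-based updates; in practice the least-squares fit may need regularization, which the sketch omits.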
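The Experiment Setup and Software Dependencies entries together suggest a training configuration along the following lines. This is a hedged PyTorch sketch, not the authors' code: the layer sizes, batch size, learning rate, and decay schedule are assumptions chosen for illustration (the actual values are in Secs. S3 and S4 of the paper), and only the MNIST classifier trained with Adadelta is shown, not the NN differential-equation solver.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Fully connected network with sigmoidal activations; the layer sizes here
# are illustrative, not the architecture reported in the paper.
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 256), nn.Sigmoid(),
    nn.Linear(256, 128), nn.Sigmoid(),
    nn.Linear(128, 10),
)

train_data = datasets.MNIST("data", train=True, download=True,
                            transform=transforms.ToTensor())
loader = DataLoader(train_data, batch_size=64, shuffle=True)

# Adadelta, as in the paper's MNIST experiment; Adam or Adagrad would be
# swapped in the same way for the optimizer comparisons.
optimizer = torch.optim.Adadelta(model.parameters(), lr=1.0)
# A weak, training-step-dependent learning-rate decay (the exact schedule
# is an assumption for illustration).
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer, lambda step: 1.0 / (1.0 + 1e-4 * step))
loss_fn = nn.CrossEntropyLoss()

for epoch in range(2):
    for x, y in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        optimizer.step()
        scheduler.step()
```

In the Koopman-training setting, weight/bias snapshots recorded during a window of this standard training would be passed to the fitting routine sketched above in place of continued backpropagation.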