Randomized Automatic Differentiation

Authors: Deniz Oktay, Nick McGreivy, Joshua Aduol, Alex Beatson, Ryan P Adams

ICLR 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We develop RAD techniques for a variety of simple neural network architectures, and show that for a fixed memory budget, RAD converges in fewer iterations than using a small batch size for feedforward networks, and in a similar number for recurrent networks. We also show that RAD can be applied to scientific computing, and use it to develop a low-memory stochastic gradient method for optimizing the control parameters of a linear reaction-diffusion PDE representing a fission reactor.
Researcher Affiliation | Academia | Princeton University, Princeton, NJ; {doktay,mcgreivy,jaduol,abeatson,rpa}@princeton.edu
Pseudocode | Yes | Algorithm 1: RMAD with path sampling (an illustrative sketch of the underlying sampling idea follows the table).
Open Source Code | Yes | The code is provided on GitHub: https://github.com/PrincetonLIPS/RandomizedAutomaticDifferentiation
Open Datasets | Yes | We evaluate our proposed RAD method on two feedforward architectures: a small fully connected network trained on MNIST, and a small convolutional network trained on CIFAR-10. We also evaluate our method on an RNN trained on Sequential-MNIST.
Dataset Splits | Yes | We then randomly hold out a validation dataset of size 5000 from the CIFAR-10 and MNIST training sets and train each pair on the reduced training dataset and evaluate on the validation set. (A sketch of such a hold-out split appears after the table.)
Hardware Specification | Yes | All experiments were run on a single NVIDIA K80 or V100 GPU.
Software Dependencies | No | The paper mentions
Experiment Setup | Yes | Our feedforward network full-memory baseline is trained with a minibatch size of 150... We train with the Adam optimizer... We tune the initial learning rate and ℓ2 weight decay parameters... The learning rate was fixed at 10⁻⁴ for all gradient estimators... All recurrent models are trained with SGD without momentum. (An optimizer-configuration sketch follows the table.)
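
The paper's Algorithm 1 (RMAD with path sampling) is only named in the table above, so here is a minimal NumPy sketch of the core idea behind randomized automatic differentiation as it is applied to feedforward networks: store only a random subset of the intermediate activations and rescale by the inverse keep fraction so the weight-gradient estimate stays unbiased. This is not the authors' implementation; all sizes, the function names (forward_with_sampling, rad_backward), and the choice of sampling minibatch rows are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes only (not the paper's): batch B, input d, hidden h, classes c,
# and k = number of minibatch rows whose activations are actually stored.
B, d, h, c, k = 150, 784, 256, 10, 32

W1 = 0.01 * rng.standard_normal((d, h))
W2 = 0.01 * rng.standard_normal((h, c))

def forward_with_sampling(x):
    """Forward pass that stores only k randomly sampled rows of each activation;
    this subsampling is where the memory saving comes from."""
    a1 = np.maximum(x @ W1, 0.0)                 # ReLU hidden activation, B x h
    idx = rng.choice(B, size=k, replace=False)   # sampled minibatch rows
    cache = {
        "idx": idx,
        "x_rows": x[idx],                        # k x d stored instead of B x d
        "a1_rows": a1[idx],                      # k x h stored instead of B x h
        "relu_mask": a1 > 0.0,                   # boolean mask, cheap to keep in full
    }
    return a1 @ W2, cache                        # logits and the saved randomness

def rad_backward(grad_logits, cache):
    """Unbiased weight-gradient estimates: each full product A^T G is replaced by
    a sum over the k sampled rows, rescaled by B/k so the expectation is exact."""
    idx, scale = cache["idx"], B / k
    dW2 = scale * cache["a1_rows"].T @ grad_logits[idx]
    da1 = (grad_logits @ W2.T) * cache["relu_mask"]
    dW1 = scale * cache["x_rows"].T @ da1[idx]
    return dW1, dW2

# Example usage with random data standing in for an MNIST minibatch.
x = rng.standard_normal((B, d))
logits, cache = forward_with_sampling(x)
grad_logits = np.ones_like(logits) / (B * c)     # placeholder upstream gradient
dW1, dW2 = rad_backward(grad_logits, cache)
```

Reusing the same sampled rows at every layer is just one possible design; the sample could equally be drawn independently per operation, trading variance against how much randomness must be recorded.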
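
For the dataset-split row, a minimal sketch of randomly holding out 5000 validation examples from a training set might look as follows; the helper name, seed, and use of index arrays are hypothetical and not taken from the released code.

```python
import numpy as np

def holdout_split(num_train, val_size=5000, seed=0):
    """Randomly partition training indices into a reduced training set and a
    held-out validation set of `val_size` examples (hypothetical helper)."""
    perm = np.random.default_rng(seed).permutation(num_train)
    return perm[val_size:], perm[:val_size]

# CIFAR-10 ships 50,000 training images and MNIST 60,000.
cifar_train_idx, cifar_val_idx = holdout_split(50000)
mnist_train_idx, mnist_val_idx = holdout_split(60000)
```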
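
The experiment-setup row quotes Adam with a tuned initial learning rate and ℓ2 weight decay for the feedforward models, and SGD without momentum for the recurrent models. Below is a hedged PyTorch sketch of those optimizer choices; the model definitions are placeholders, the Adam hyperparameter values are illustrative rather than the tuned ones, and the quoted fixed learning rate of 10⁻⁴ is assumed to apply to the recurrent setting.

```python
import torch

# Placeholder models standing in for the paper's feedforward and recurrent networks.
ff_model = torch.nn.Sequential(
    torch.nn.Linear(784, 256), torch.nn.ReLU(), torch.nn.Linear(256, 10)
)
rnn_model = torch.nn.RNN(input_size=1, hidden_size=128, batch_first=True)

# Feedforward: Adam with tuned initial learning rate and l2 weight decay
# (the values below are illustrative, not the tuned ones).
ff_opt = torch.optim.Adam(ff_model.parameters(), lr=1e-3, weight_decay=1e-4)

# Recurrent: SGD without momentum; the fixed learning rate of 1e-4 from the quote
# is assumed to apply here.
rnn_opt = torch.optim.SGD(rnn_model.parameters(), lr=1e-4, momentum=0.0)
```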