Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Enabling certification of verification-agnostic networks via memory-efficient semidefinite programming
Authors: Sumanth Dathathri, Krishnamurthy Dvijotham, Alexey Kurakin, Aditi Raghunathan, Jonathan Uesato, Rudy R. Bunel, Shreya Shankar, Jacob Steinhardt, Ian Goodfellow, Percy S. Liang, Pushmeet Kohli
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | For two verification-agnostic networks on MNIST and CIFAR-10, we significantly improve verified robust accuracy from 1% to 88% and 6% to 40% respectively. We also demonstrate tight verification of a quadratic stability specification for the decoder of a variational autoencoder. |
| Researcher Affiliation | Collaboration | ¹DeepMind ²Google Brain ³Stanford ⁴UC Berkeley ⁵Work done at Google |
| Pseudocode | Yes | Algorithm 1 Verification via SDP-FO |
| Open Source Code | Yes | Code available at https://github.com/deepmind/jax_verify. |
| Open Datasets | Yes | For two verification-agnostic networks on MNIST and CIFAR-10, we significantly improve verified robust accuracy from 1% to 88% and 6% to 40% respectively. |
| Dataset Splits | No | The paper mentions using MNIST and CIFAR-10 datasets, and refers to '500 test set examples' and implies the use of a validation set in 'initial grid search to find a good set of hyperparameters on the validation set'. However, it does not provide explicit details about the specific training/validation/test splits, percentages, or sample counts used for reproduction within the main text or readily accessible appendix sections. |
| Hardware Specification | Yes | Using a P100 GPU, maximum runtime is roughly 15 minutes per MLP instance and 3 hours per CNN instance, though most instances are verified sooner. |
| Software Dependencies | No | The paper mentions using ML frameworks like TensorFlow, PyTorch, or JAX, and states that core logic is implemented in JAX. However, it does not provide specific version numbers for these software dependencies (e.g., JAX 0.x.y or PyTorch 1.z). |
| Experiment Setup | Yes | Complete training and hyperparameter details are included in Appendix B.1. All networks are trained for 50 epochs using Adam with a learning rate of 0.001. We use a batch size of 256 for MNIST and 128 for CIFAR-10. |