Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Verification of Non-Linear Specifications for Neural Networks
Authors: Chongli Qin, Krishnamurthy (Dj) Dvijotham, Brendan O'Donoghue, Rudy Bunel, Robert Stanforth, Sven Gowal, Jonathan Uesato, Grzegorz Swirszcz, Pushmeet Kohli
ICLR 2019 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experimental evaluation shows that our method is able to effectively verify these specifications. Moreover, our evaluation exposes the failure modes in models which cannot be verified to satisfy these specifications, thus emphasizing the importance of training models not just to fit training data but also to be consistent with specifications. 4 EXPERIMENTS We have proposed a novel set of specifications to be verified as well as new verification algorithms that can verify whether these specifications are satisfied by neural networks. In order to validate our contributions experimentally, we perform two sets of experiments: |
| Researcher Affiliation | Industry | DeepMind, London, N1C 4AG, UK correspondence: EMAIL |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code or links to a code repository. |
| Open Datasets | Yes | We study the semantic distance specification (1) in the context of the CIFAR-10 dataset. We define the distances d(i, j) between labels as their distance according to WordNet (Miller, 1995) (the full distance matrix used is shown in Appendix F). [...] MNIST and CIFAR-10: For these datasets both models were trained to maximize the log likelihood of true-label predictions while being robust to adversarial examples via the method described in (Wong & Kolter, 2018). The training for model A places a heavier loss than model B when robustness measures are violated. Mujoco: To test the energy specification, we used the Mujoco physics engine (Todorov et al., 2012) to create a simulation of a simple pendulum with damping friction. |
| Dataset Splits | No | The paper mentions a test set for the Mujoco dataset ('27000 was set aside as test set'), but it does not specify a validation set split or other dataset split details for any of the datasets used. |
| Hardware Specification | No | The paper mentions 'our desktop machine (with 1 GPU and 8G of memory)' but does not provide specific GPU or CPU models, or further detailed hardware specifications used for running experiments. |
| Software Dependencies | No | The paper mentions software tools like 'GLOP as the LP solver' and 'CVXPY' but does not specify their version numbers, which are necessary for reproducible descriptions of software dependencies. |
| Experiment Setup | Yes | 4.2 EXPERIMENTAL SETUP For each of the specifications we trained two networks (referred to as model A and B in the following) that satisfy our specification to varying degrees. [...] Pendulum: Model A We train with an ℓ1 loss and an energy loss on the next-state prediction; the exact loss we impose on this model is (we denote (w_T, h_T, θ̇_T) as the ground-truth state): l(f) = ‖f(w, h, θ̇) − (w_T, h_T, θ̇_T)‖₁ (ℓ1 loss) + \|E(f(w, h, θ̇)) − E((w_T, h_T, θ̇_T))\| (energy difference loss) + ReLU(E(f(w, h, θ̇)) − E(w, h, θ̇)) (increase-in-energy loss) [...] CIFAR-10: Model A We use a network that is verifiably robust to adversarial perturbations of size 8/255 (where 255 is the range of pixel values) on 24.61% of the test examples, with respect to the standard specification that the output of the network should remain invariant to adversarial perturbations of the input. The network consists of 4 convolutional layers and 3 linear layers, with 860,000 parameters in total. |