Verification of Non-Linear Specifications for Neural Networks

Authors: Chongli Qin, Krishnamurthy (Dj) Dvijotham, Brendan O'Donoghue, Rudy Bunel, Robert Stanforth, Sven Gowal, Jonathan Uesato, Grzegorz Swirszcz, Pushmeet Kohli

ICLR 2019

Reproducibility Variable Result LLM Response
Research Type: Experimental. "Our experimental evaluation shows that our method is able to effectively verify these specifications. Moreover, our evaluation exposes the failure modes in models which cannot be verified to satisfy these specifications, emphasizing the importance of training models not just to fit the training data but also to be consistent with specifications." From Section 4 (Experiments): "We have proposed a novel set of specifications to be verified, as well as new verification algorithms that can verify whether these specifications are satisfied by neural networks. In order to validate our contributions experimentally, we perform two sets of experiments:"
Researcher Affiliation: Industry. "DeepMind, London, N1C 4AG, UK. Correspondence: chongliqin@google.com"
Pseudocode: No. No structured pseudocode or algorithm blocks were found in the paper.
Open Source Code: No. The paper does not provide any explicit statements about releasing source code or links to a code repository.
Open Datasets: Yes. "We study the semantic distance specification (1) in the context of the CIFAR-10 dataset. We define the distances d(i, j) between labels as their distance according to WordNet (Miller, 1995) (the full distance matrix used is shown in Appendix F). [...] MNIST and CIFAR-10: For these datasets, both models were trained to maximize the log likelihood of true label predictions while being robust to adversarial examples via the method described in (Wong & Kolter, 2018). The training for model A places a heavier loss than model B when robustness measures are violated. MuJoCo: To test the energy specification, we used the MuJoCo physics engine (Todorov et al., 2012) to create a simulation of a simple pendulum with damping friction."
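The semantic-distance specification quoted above can be illustrated with a small sketch: given a label distance matrix d(i, j) and a network's logits, check whether the top prediction falls outside an allowed semantic distance of the true label. The toy distance matrix, the threshold, and the function name below are illustrative assumptions, not the paper's exact formulation (which verifies the property over whole perturbation sets rather than single inputs):

```python
import numpy as np

def violates_semantic_spec(logits, true_label, dist, threshold=2.0):
    """Empirical (non-formal) check of a semantic-distance specification:
    the predicted label must lie within `threshold` of the true label
    under the label distance matrix `dist`. Names and threshold are
    illustrative assumptions, not the paper's exact definitions."""
    pred = int(np.argmax(logits))
    return dist[true_label, pred] > threshold

# Toy 3-class distance matrix (e.g. cat/dog semantically close, truck far).
dist = np.array([[0.0, 1.0, 4.0],
                 [1.0, 0.0, 4.0],
                 [4.0, 4.0, 0.0]])

# A confident prediction of a semantically distant class violates the spec.
print(violates_semantic_spec(np.array([0.1, 0.2, 2.0]), true_label=0, dist=dist))
```

A formal verifier would instead bound the worst case of this check over an entire input perturbation set, which is what the paper's algorithms address.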
Dataset Splits: No. The paper mentions a test set for the MuJoCo data ("27000 was set aside as test set") but does not specify a validation split or other dataset split details for any of the datasets used.
Hardware Specification: No. The paper mentions "our desktop machine (with 1 GPU and 8G of memory)" but does not give specific GPU or CPU models, or other detailed hardware specifications used for running the experiments.
Software Dependencies: No. The paper mentions software tools such as GLOP (as the LP solver) and CVXPY but does not specify their version numbers, which would be necessary for a reproducible description of software dependencies.
Experiment Setup: Yes. "4.2 EXPERIMENTAL SETUP: For each of the specifications we trained two networks (referred to as model A and B in the following) that satisfy our specification to varying degrees. [...] Pendulum: Model A. We train with an ℓ1 loss and an energy loss on the next-state prediction; denoting the ground-truth next state by (w_T, h_T, sω_T), the exact loss we impose on this model is

l(f) = ‖f(w, h, sω) − (w_T, h_T, sω_T)‖₁ (ℓ1 loss) + |E(f(w, h, sω)) − E(w_T, h_T, sω_T)| (energy difference loss) + ReLU(E(f(w, h, sω)) − E(w, h, sω)) (increase-in-energy loss)

[...] CIFAR-10: Model A. We use a network that is verifiably robust to adversarial perturbations of size 8/255 (where 255 is the range of pixel values) on 24.61% of the test examples, with respect to the standard specification that the output of the network should remain invariant to adversarial perturbations of the input. The network consists of 4 convolutional layers and 3 linear layers, with 860,000 parameters in total."
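The quoted pendulum training loss (ℓ1 term, energy-difference term, and a ReLU penalty on any energy increase relative to the input state) can be sketched as follows. The concrete energy function E below (kinetic plus height-based potential) is an assumption for illustration, since the paper's exact E is not reproduced in the quote:

```python
import numpy as np

def energy(w, h, omega, g=9.81):
    # Hypothetical pendulum energy: kinetic term plus height-based
    # potential. The paper's exact E is not reproduced here.
    return 0.5 * omega**2 + g * h

def pendulum_loss(pred, target, state):
    """Sketch of the quoted loss: l1(pred, target)
    + |E(pred) - E(target)| + ReLU(E(pred) - E(state))."""
    pred, target, state = map(np.asarray, (pred, target, state))
    l1 = np.abs(pred - target).sum()                  # l1 loss
    e_pred = energy(*pred)
    e_diff = abs(e_pred - energy(*target))            # energy difference loss
    e_incr = max(e_pred - energy(*state), 0.0)        # increase-in-energy loss
    return l1 + e_diff + e_incr

# A prediction that adds spurious kinetic energy is penalized three times:
# by the l1 term, the energy-difference term, and the energy-increase term.
print(pendulum_loss(pred=(1.0, 0.0, 2.0),
                    target=(1.0, 0.0, 0.0),
                    state=(1.0, 0.0, 0.0)))
```

The third term is what encodes the physical prior that a damped pendulum should never gain energy from one state to the next.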