Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Scaling the Convex Barrier with Sparse Dual Algorithms
Authors: Alessandro De Palma, Harkirat Singh Behl, Rudy Bunel, Philip H.S. Torr, M. Pawan Kumar
JMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We alleviate this deficiency by presenting two novel dual algorithms: one operates a subgradient method on a small active set of dual variables, the other exploits the sparsity of Frank-Wolfe type optimizers to incur only a linear memory cost. Both methods recover the strengths of the new relaxation: tightness and a linear separation oracle. At the same time, they share the benefits of previous dual approaches for weaker relaxations: massive parallelism, GPU implementation, low cost per iteration and valid bounds at any time. As a consequence, we can obtain better bounds than off-the-shelf solvers in only a fraction of their running time, attaining significant formal verification speed-ups. ... Providing a detailed experimental evaluation of the new solver, both for incomplete and complete verification. |
| Researcher Affiliation | Academia | Alessandro De Palma EMAIL Harkirat Singh Behl EMAIL Rudy Bunel EMAIL Philip H.S. Torr EMAIL M. Pawan Kumar EMAIL Department of Engineering Science University of Oxford Oxford OX1 3PJ |
| Pseudocode | Yes | Algorithm 1 Active Set 1: function activeset compute bounds(Problem (5)) ... Algorithm 2 Saddle Point 1: function saddlepoint compute bounds(Problem (5)) |
| Open Source Code | Yes | Code implementing our algorithms is available as part of the OVAL neural network verification framework: https://github.com/oval-group/oval-bab. |
| Open Datasets | Yes | on the CIFAR-10 test set (Krizhevsky and Hinton, 2009). |
| Dataset Splits | Yes | we restricted the experiments to the first 2567 CIFAR-10 test set images for the experiments on the SGD-trained network (Figures 1, 2), and to the first 4129 images for the network trained via the method by Madry et al. (2018) (Figures 9, 10). ... We restricted the original data set to 100 properties per network so as to mirror the setup of the recent VNN-COMP competition (VNN-COMP, 2020). |
| Hardware Specification | Yes | Unless otherwise stated, experiments were run under the following setup: Ubuntu 16.04.2 LTS, on a single Nvidia Titan Xp GPU, except those based on Gurobi and the CPU version of our solvers. The latter were run on i7-6850K CPUs, utilising 4 cores for the incomplete verification experiments, and 6 cores for the more demanding complete verification setting. |
| Software Dependencies | No | The implementation of our algorithms, available at https://github.com/oval-group/oval-bab as part of the OVAL neural network verification framework, is based on Pytorch (Paszke et al., 2017). ... Gurobi Optimization, 2020. |
| Experiment Setup | Yes | For all supergradient-based methods (Big-M, Active Set), we employed the Adam update rule (Kingma and Ba, 2015), which showed stronger empirical convergence. For Big-M, replicating the findings by Bunel et al. (2020a) on their supergradient method, we linearly decrease the step size from 10⁻² to 10⁻⁴. Active Set is initialized with 500 Big-M iterations, after which the step size is reset and linearly scaled from 10⁻³ to 10⁻⁶. We found the addition of variables to the active set to be effective before convergence: we add variables every 450 iterations, without re-scaling the step size again. Every addition consists of 2 new variables (see Algorithm 1) and we cap the maximum number of cuts to 7. This was found to be a good compromise between fast bound improvement and computational cost. For Saddle Point, we use 1/(t + 10) as step size to stay closer to the initialization points. The primal initialization algorithm (see appendix C.2) is run for 100 steps on the Big-M variables, with step size linearly decreased from 10⁻² to 10⁻⁵. |
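The quoted setup describes two step-size schedules: a linear decay between fixed endpoints (used by Big-M, Active Set, and the primal initialization) and a 1/(t + 10) rule (used by Saddle Point). A minimal sketch of these schedules is below; the function names and the `total_steps` parameterization are illustrative assumptions, not identifiers from the paper's code.

```python
def linear_decay(step, total_steps, lr_start, lr_end):
    """Linearly interpolate the step size from lr_start (step 0)
    to lr_end (last step), clamping past the end."""
    frac = min(step / max(total_steps - 1, 1), 1.0)
    return lr_start + frac * (lr_end - lr_start)


def saddle_point_step_size(t):
    """The 1/(t + 10) schedule quoted for the Saddle Point solver."""
    return 1.0 / (t + 10)


# Big-M schedule from the quoted setup: 10^-2 down to 10^-4
# over its iteration budget (500 iterations used here for illustration).
print(linear_decay(0, 500, 1e-2, 1e-4))    # first step: 0.01
print(linear_decay(499, 500, 1e-2, 1e-4))  # last step: 0.0001
print(saddle_point_step_size(0))           # first step: 0.1
```

In the reported configuration these schedules drive Adam's base learning rate rather than a plain (super)gradient step, and Active Set resets the schedule (10⁻³ to 10⁻⁶) after its 500 Big-M warm-up iterations.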