Combinatorial Attacks on Binarized Neural Networks

Authors: Elias B. Khalil, Amrita Gupta, Bistra Dilkina

ICLR 2019

Reproducibility Variable: Result, followed by the supporting LLM response.

Research Type: Experimental
"Experimentally, we evaluate both proposed methods against the standard gradient-based attack (PGD) on MNIST and Fashion-MNIST, and show that IProp performs favorably compared to PGD, while scaling beyond the limits of the MILP."

Researcher Affiliation: Academia
Elias B. Khalil, College of Computing, Georgia Tech (lyes@gatech.edu); Amrita Gupta, College of Computing, Georgia Tech (agupta375@gatech.edu); Bistra Dilkina, Department of Computer Science, University of Southern California (dilkina@usc.edu).

Pseudocode: Yes
IProp(x, ε, BNN weight matrices {W_l}_{l=1}^D, prediction, target, step size S)

Open Source Code: No
The paper mentions using "BNN code by Courbariaux et al. (2016)" with a link to a GitHub repository. This is third-party code that the authors used, not their own source code for the MILP or IProp methods described in the paper.

Open Datasets: Yes
"We evaluate the MILP model, IProp and the Projected Gradient Descent method (with restarts) (PGD) (Madry et al., 2017), a representative gradient-based attack, on BNN models pre-trained on the MNIST (LeCun et al., 1998) and Fashion-MNIST (Xiao et al., 2017) datasets."

Dataset Splits: No
The paper mentions "60,000 MNIST and Fashion-MNIST training images" and "1,000 test points from the MNIST dataset and 100 test points from the Fashion-MNIST dataset", but it does not specify a separate validation split or its size.

Hardware Specification: Yes
"We train networks with the following depth x width values: 2x100, 2x200, 2x300, 2x400, 2x500, 3x100, 4x100, 5x100. While these networks are not large by current deep learning standards, they are larger than most networks used in recent papers (Fischetti & Jo, 2018; Narodytska et al., 2017) that leverage integer programming or SAT solving for adversarial attacks or verification. All BNNs are trained to minimize the cross-entropy loss with batch normalization (Ioffe & Szegedy, 2015) for 100 epochs on the full 60,000 MNIST and Fashion-MNIST training images, achieving between 90-95% test accuracy on MNIST, and 80-90% on Fashion-MNIST."

Software Dependencies: No
The paper mentions using the "Gurobi Python API" and "PyTorch" but does not provide specific version numbers for these software components.

Experiment Setup: Yes
"All BNNs are trained to minimize the cross-entropy loss with batch normalization (Ioffe & Szegedy, 2015) for 100 epochs on the full 60,000 MNIST and Fashion-MNIST training images, achieving between 90-95% test accuracy on MNIST, and 80-90% on Fashion-MNIST."
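The models under attack are binarized neural networks (Courbariaux et al., 2016): weights and hidden activations are constrained to ±1 via the sign function. The paper does not include its network code, so the following is only a minimal NumPy sketch of a BNN forward pass under those constraints; the layer shapes are hypothetical, and batch normalization (which the paper uses during training) is omitted for brevity.

```python
import numpy as np

def sign(z):
    # Deterministic binarization: map to ±1, with sign(0) taken as +1.
    return np.where(z >= 0.0, 1.0, -1.0)

def bnn_forward(x, weights):
    """Forward pass through a binarized network: real-valued weight
    matrices are binarized with sign(), hidden activations are sign()
    of the pre-activations, and the final layer's scores are returned
    without a final binarization. Batch norm omitted (simplification)."""
    a = x
    for W in weights[:-1]:
        a = sign(sign(W) @ a)      # binarized weights, then binarized activations
    return sign(weights[-1]) @ a   # output logits left real-valued
```

Because every hidden unit outputs ±1, the output logits are integer-valued sums, which is what makes MILP/SAT encodings of such networks tractable.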
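The baseline attack referenced throughout the assessment is Projected Gradient Descent (Madry et al., 2017): repeatedly take a signed gradient step on the loss, then project the perturbed input back into an L∞ ball of radius ε around the original. As a hedged illustration only, here is a minimal NumPy sketch of that loop on a hypothetical logistic-regression surrogate (not the paper's BNN, whose sign activations have zero gradient almost everywhere); the model, step size, and iteration count are assumptions for the example.

```python
import numpy as np

def pgd_attack(x, y, w, b, eps, step, n_iter):
    """L_inf PGD against a toy logistic model p = sigmoid(w @ x + b).
    Hypothetical surrogate model for illustration, not the paper's BNN."""
    x_adv = x.copy()
    for _ in range(n_iter):
        z = w @ x_adv + b
        p = 1.0 / (1.0 + np.exp(-z))
        # Gradient of the cross-entropy loss w.r.t. the input: (p - y) * w.
        grad = (p - y) * w
        # Signed ascent step on the loss (FGSM-style inner step).
        x_adv = x_adv + step * np.sign(grad)
        # Project back into the L_inf ball of radius eps around x.
        x_adv = np.clip(x_adv, x - eps, x + eps)
    return x_adv
```

The projection via `np.clip` is what keeps the adversarial example within the ε-budget that the paper's MILP and IProp attacks also respect.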