Lower Bounds on Cross-Entropy Loss in the Presence of Test-time Adversaries
Authors: Arjun Nitin Bhagoji, Daniel Cullina, Vikash Sehwag, Prateek Mittal
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We use our algorithm to find lower bounds on the cross-entropy loss for these benchmark datasets, as well as for synthetic Gaussian data. Comparing these bounds to the training loss obtained by state-of-the-art robust optimization techniques on commonly used deep neural networks, we find a gap in terms of convergence to the optimal loss. |
| Researcher Affiliation | Academia | 1Department of Computer Science, University of Chicago 2Department of Electrical and Computer Engineering, Pennsylvania State University 3Department of Electrical Engineering, Princeton University. |
| Pseudocode | Yes | Algorithm 1 OptProb |
| Open Source Code | Yes | The code to reproduce all results in this paper is available at https://github.com/arjunbhagoji/log-loss-lower-bounds. |
| Open Datasets | Yes | MNIST (LeCun & Cortes, 1998), Fashion-MNIST (Xiao et al., 2017) and CIFAR-10 (Krizhevsky & Hinton, 2009). |
| Dataset Splits | No | The paper mentions 'training samples' and evaluates on 'training data' and 'test data', but it does not specify a separate validation dataset or its split details. |
| Hardware Specification | Yes | All results are obtained on an Intel Xeon cluster with 8 P100 GPUs. |
| Software Dependencies | No | The paper mentions using the 'maximum flow algorithm from Scipy (Virtanen et al., 2020)' and a 'general purpose solver for convex programs with non-linear objective functions from CVXOPT (Andersen et al., 2013)', but it does not specify version numbers for either package. A minimal sketch of the SciPy maximum-flow call appears after the table. |
| Experiment Setup | Yes | We train a ResNet-18 network using adversarial training and TRADES... We choose the 3 vs. 7 classification task as a representative binary classification problem... In each case, there are a total of n = 5000 training samples per class... We pick the commonly used ℓ2-norm ball constraint... TRADES (β = 1.0) and TRADES (β = 6.0). A hedged PGD adversarial-training sketch follows the table. |
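The software-dependency row quotes the paper's use of SciPy's maximum-flow routine, which the authors report using to compute their lower bounds. The snippet below is a minimal sketch of the general `scipy.sparse.csgraph.maximum_flow` call on a toy capacity graph; the graph itself is illustrative and is not the paper's conflict-graph construction.

```python
# Minimal sketch of the SciPy maximum-flow call the paper reports using.
# The capacity graph below is a toy example, not the paper's construction.
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import maximum_flow

# Toy capacity matrix: node 0 is the source, node 3 the sink.
# maximum_flow requires integer capacities in CSR format.
capacities = csr_matrix(np.array([
    [0, 3, 2, 0],
    [0, 0, 1, 2],
    [0, 0, 0, 3],
    [0, 0, 0, 0],
], dtype=np.int32))

result = maximum_flow(capacities, 0, 3)
print(result.flow_value)  # total source-to-sink flow (here: 5)
```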
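The experiment-setup row describes comparing the computed lower bounds against adversarial training and TRADES under an ℓ2-norm ball constraint. The sketch below shows a generic ℓ2 PGD inner maximization and one adversarial-training step in PyTorch; the hyperparameters (`eps`, `steps`, `alpha`) and the 4-D image-tensor assumption are illustrative, not the authors' settings.

```python
# Generic sketch of one adversarial-training step under an l2-norm ball,
# the threat model the paper evaluates against. Hyperparameters are
# illustrative placeholders, not the paper's settings.
import torch
import torch.nn.functional as F

def l2_pgd_attack(model, x, y, eps=2.0, steps=10, alpha=0.5):
    """Projected gradient ascent inside an l2 ball of radius eps.

    Assumes x is a 4-D image batch of shape (B, C, H, W).
    """
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        # Normalize the gradient per example so each step has l2 length alpha.
        g_norm = grad.flatten(1).norm(dim=1).clamp_min(1e-12).view(-1, 1, 1, 1)
        delta = delta + alpha * grad / g_norm
        # Project back onto the l2 ball of radius eps.
        d_norm = delta.flatten(1).norm(dim=1).view(-1, 1, 1, 1)
        delta = (delta * (eps / d_norm).clamp(max=1.0)).detach().requires_grad_(True)
    return (x + delta).detach()

def adversarial_training_step(model, optimizer, x, y):
    """One minimization step on the adversarial cross-entropy loss."""
    x_adv = l2_pgd_attack(model, x, y)
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```

This is the standard min-max recipe (inner PGD maximization, outer cross-entropy minimization); TRADES modifies the outer objective with a KL regularization term weighted by β, which the paper evaluates at β = 1.0 and β = 6.0.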