Detecting Overfitting via Adversarial Examples
Authors: Roman Werpachowski, András György, Csaba Szepesvári
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We develop a specialized variant of our test for multiclass image classification, and apply it to testing overfitting of recent models to the popular ImageNet benchmark. Our method correctly indicates overfitting of the trained model to the training set, but is not able to detect any overfitting to the test set, in line with other recent work on this topic. To understand the behavior of our tests better, we first use them on a synthetic binary classification problem, where the tests are able to successfully identify the cases where overfitting is present. Then we apply our independence tests to state-of-the-art classification methods for the popular image classification benchmark, ImageNet [8]. |
| Researcher Affiliation | Industry | Roman Werpachowski, András György, Csaba Szepesvári — DeepMind, London, UK {romanw,agyorgy,szepi}@google.com |
| Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any statement or link regarding the availability of its source code. |
| Open Datasets | Yes | We applied our test to check if state-of-the-art classifiers for the ImageNet dataset [8] have been overfitted to the test set. In particular, we use the VGG16 classifier of [27] and the Resnet50 classifier of [16]. |
| Dataset Splits | No | The paper discusses 'training set' and 'test set' but does not explicitly mention 'validation' splits or percentages for any of the datasets used. |
| Hardware Specification | No | The paper mentions 'computational considerations' and 'computational resources' but does not provide any specific hardware details such as GPU models, CPU types, or memory specifications used for the experiments. |
| Software Dependencies | No | The paper mentions using VGG16 and Resnet50 models, which are deep learning architectures, but it does not specify any software environments, libraries, or their version numbers (e.g., TensorFlow, PyTorch, Python version). |
| Experiment Setup | Yes | The models were trained using the parameters recommended by their respective authors. The preprocessing procedure of both architectures involves rescaling every image so that the smaller of width and height is 256, then cropping centrally to size 224 × 224. To control the amount of change, we limit the magnitude of translations and allow only v ∈ V_ε = {u ∈ ℤ² : u ≠ (0, 0), ‖u‖ ≤ ε}, for some fixed positive ε. Only a single trained VGG16 model was analyzed, while the Resnet50 model was retrained 120 times. |
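The preprocessing and perturbation set quoted in the Experiment Setup row can be sketched in plain Python. This is an illustrative reconstruction, not the authors' code: the function names are invented here, and the sup-norm bound on ‖u‖ is an assumption (the quoted text does not specify which norm is used).

```python
def resized_dims(width, height, target=256):
    """Scale dimensions so the smaller of width/height becomes `target`,
    preserving aspect ratio (first preprocessing step described in the paper)."""
    if width <= height:
        return target, round(height * target / width)
    return round(width * target / height), target

def center_crop_box(width, height, size=224):
    """(left, top, right, bottom) of a central size x size crop
    (second preprocessing step: central crop to 224 x 224)."""
    left = (width - size) // 2
    top = (height - size) // 2
    return left, top, left + size, top + size

def translation_set(eps):
    """V_eps = {u in Z^2 : u != (0, 0), ||u|| <= eps}; the sup-norm
    is an assumption here, so this enumerates a (2*eps+1)^2 - 1 grid."""
    return [(dx, dy)
            for dx in range(-eps, eps + 1)
            for dy in range(-eps, eps + 1)
            if (dx, dy) != (0, 0)]
```

For example, a 400 × 500 image would be rescaled to 256 × 320 and then cropped to the central 224 × 224 region; with ε = 1 the translation set contains the 8 nonzero offsets of the surrounding 3 × 3 grid.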