Parameterized Rate-Distortion Stochastic Encoder
Authors: Quan Hoang, Trung Le, Dinh Phung
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We investigate the behavior of the algorithm on the MNIST (LeCun et al., 1998) and CelebA (Liu et al., 2015) datasets. For supervised learning, we demonstrate that the derived objective can be seen as a form of regularization that helps improve generalization. For robust learning, we show that introducing inductive bias to the learning of PARADISE can significantly improve interpretability as well as robustness to adversarial attacks on the CIFAR-10 (Krizhevsky et al., 2009) and ImageNet (Russakovsky et al., 2015) datasets. In particular, on the CIFAR-10 dataset, our model reduces standard and adversarial error rates in comparison to the state-of-the-art (Qin et al., 2019) by 50% and 41%, respectively, without the expensive computational cost of adversarial training. |
| Researcher Affiliation | Academia | 1Department of DSAI, Faculty of Information Technology, Monash University, Australia. |
| Pseudocode | Yes | Pseudocode of the learning algorithms is described in Sec. 1.6 of the supplementary material. |
| Open Source Code | Yes | And finally, for additional details and to encourage reproducibility, the supplementary material contains more extensive information on the experiments as well as additional results. We will also release our source code in the public domain. |
| Open Datasets | Yes | We investigate the behavior of the algorithm on the MNIST (LeCun et al., 1998) and CelebA (Liu et al., 2015) datasets. For supervised learning, we demonstrate that the derived objective can be seen as a form of regularization that helps improve generalization. For robust learning, we show that introducing inductive bias to the learning of PARADISE can significantly improve interpretability as well as robustness to adversarial attacks on the CIFAR-10 (Krizhevsky et al., 2009) and ImageNet (Russakovsky et al., 2015) datasets. |
| Dataset Splits | Yes | Fig. 1 plots the posterior pθ(z | x) as Gaussian ellipses representing the 95% confidence region for 2,000 images from the test set. For all settings, we train models using 10 random seeds and take the average test accuracy, except for ImageNet due to limited resources. Table 4: Top-1 accuracy (in %) on white-box attacks crafted on 2,000 ImageNet validation images using 200 PGD steps. Table 5: Top-1 black-box accuracy (in %) on 2,000 ImageNet validation images for different perturbation size ϵ. |
| Hardware Specification | No | No specific hardware details (exact GPU/CPU models, memory amounts, or detailed computer specifications) were found for the authors' own experimental setup. |
| Software Dependencies | No | No specific software dependencies with version numbers (e.g., library or solver names with versions) were mentioned. |
| Experiment Setup | Yes | Details about the architecture and hyperparameters are in Sec. 2 of the supplementary material. For CIFAR-10, we use the base Wide ResNet WRN-28-10 architecture (Zagoruyko & Komodakis, 2016). For all settings, we train models using 10 random seeds and take the average test accuracy, except for ImageNet due to limited resources. Our method only requires doubling the batch size for posterior matching. |
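The white-box evaluation quoted above (L-infinity PGD with a fixed step count and perturbation budget ϵ) follows the standard projected-gradient-descent recipe. The sketch below is a generic, minimal illustration of that recipe on a toy logistic-regression model in NumPy, not the authors' implementation; the model, `eps`, `alpha`, and `steps` values are illustrative assumptions, and a real reproduction would run the same loop against the paper's WRN-28-10 network.

```python
import numpy as np

def pgd_attack(x, y, w, b, eps, alpha, steps):
    """L-infinity PGD against a binary logistic-regression model
    (toy stand-in for the paper's classifier; NOT the authors' code).
    The input-gradient of the cross-entropy loss is (p - y) * w,
    so each step ascends the loss and projects back into the eps-ball."""
    x_adv = x.copy()
    for _ in range(steps):
        logits = x_adv @ w + b
        p = 1.0 / (1.0 + np.exp(-logits))          # sigmoid probabilities
        grad = np.outer(p - y, w)                  # dLoss/dx, per sample
        x_adv = x_adv + alpha * np.sign(grad)      # signed gradient step
        x_adv = np.clip(x_adv, x - eps, x + eps)   # project into eps-ball
    return x_adv

# Illustrative usage: accuracy under attack should drop below clean accuracy.
rng = np.random.default_rng(0)
w, b = np.array([1.0, -1.0]), 0.0
x = rng.normal(size=(50, 2))
y = (x @ w + b > 0).astype(float)                  # labels from the model itself
x_adv = pgd_attack(x, y, w, b, eps=0.5, alpha=0.1, steps=20)
clean_acc = np.mean(((x @ w + b) > 0) == y)
adv_acc = np.mean(((x_adv @ w + b) > 0) == y)
```

Reporting `adv_acc` for a grid of ϵ values, averaged over seeds, mirrors the protocol behind Tables 4 and 5.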