Adversarially Robust Representations with Smooth Encoders
Authors: Taylan Cemgil, Sumedh Ghaisas, Krishnamurthy (Dj) Dvijotham, Pushmeet Kohli
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We illustrate our approach on standard datasets and experimentally show that significant improvements in the downstream adversarial accuracy can be achieved by learning robust representations completely in an unsupervised manner... We show empirically using simulation studies on MNIST, color MNIST and Celeb A datasets, that models trained using our method learn representations that provide a higher degree of adversarial robustness even without supervised adversarial training. |
| Researcher Affiliation | Industry | DeepMind, London {taylancemgil,sumedhg,dvij,pushmeet}@google.com |
| Pseudocode | Yes | Initialize η(0), θ(0); for τ = 1, 2, … do: xa = GetData(), xb = Select(xa; L, ϵ) (see Section 3.1); µa, Σa = f(xa; η), µb, Σb = f(xb; η) (compute representation); WD_{2,γ}(η) = EntropyRegularizedWassersteinDivergence(µa, Σa, µb, Σb, γ) (see Apdx. B.2); u ∼ N(0, I) (reparametrization trick); E1(η, θ) = (1/2v)·‖xa − g(µa + Σa^{1/2} u; θ)‖² (data fidelity); E2(η) = ½·(‖µa‖² + ‖µb‖² + Tr{Σa + Σb}) (prior fidelity); E(η, θ) = E1(η, θ) + E2(η) + WD_{2,γ}(η); η(τ), θ(τ) = OptimizationStep(E, η(τ−1), θ(τ−1)); end for. (A hedged code sketch of this training step is given below the table.) |
| Open Source Code | No | The paper does not include an explicit statement about releasing code or a link to a code repository for their method. |
| Open Datasets | Yes | We run simulations on Color MNIST, MNIST and Celeb A datasets. |
| Dataset Splits | No | The paper refers to a 'test set' but does not provide specific percentages, sample counts, or a detailed methodology for splitting the datasets into training, validation, and test sets. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models or cloud instance types used for the experiments. |
| Software Dependencies | No | The paper mentions general architectures and optimizers ('standard MLP and Conv Net architectures', 'Adam optimizer') but does not list specific software dependencies with version numbers. |
| Experiment Setup | Yes | Table 2: Experiment Hyperparameters... Training hyperparameters: representation dimension dim(Z) ∈ {32, 64, 128, 256} (VAE or SE); observation noise variance v ∈ {0.25, 0.5, 1.0, 2.0}; architecture ∈ {MLP, ConvNet}; coupling strength γ ∈ {0.01, 0.1, 1, 5, 10, 50} (SE only). Selection hyperparameters: PGD radius ϵ ∈ {0.01, 0.1, 0.2, 0.3}; PGD iteration budget L ∈ {1, 5, 10, 20, 50}. ...Each network (both the encoder and decoder) are randomly initialized and trained for 300K iterations. (A sketch enumerating this grid is given after the table.) |
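The pseudocode row above outlines one smooth-encoder (SE) training step. The following is a minimal PyTorch sketch of that step, written under simplifying assumptions not taken from the paper: posteriors are diagonal Gaussians, the Select step of Section 3.1 is approximated by a plain L∞ PGD loop, and the entropy-regularized Wasserstein divergence of Appendix B.2 is replaced by the closed-form unregularized squared 2-Wasserstein distance between diagonal Gaussians, with γ reused as a simple penalty weight. All names (`Encoder`, `Decoder`, `w2_diag_gaussian`, `select_pgd`, `se_training_step`) are illustrative; the authors released no code.

```python
# Hedged sketch of one smooth-encoder (SE) training step, following the pseudocode
# quoted above. Not the authors' implementation: diagonal Gaussian posteriors,
# a plain L_inf PGD "Select" step, and the unregularized closed-form W2^2 between
# diagonal Gaussians are simplifying assumptions made here.
import torch
import torch.nn as nn


class Encoder(nn.Module):
    """f(x; eta) -> (mu, diagonal variance) of the Gaussian posterior q(z | x)."""
    def __init__(self, x_dim=784, z_dim=64):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(x_dim, 512), nn.ReLU())
        self.mu = nn.Linear(512, z_dim)
        self.log_var = nn.Linear(512, z_dim)

    def forward(self, x):
        h = self.body(x)
        return self.mu(h), self.log_var(h).exp()


class Decoder(nn.Module):
    """g(z; theta) -> reconstruction of x."""
    def __init__(self, x_dim=784, z_dim=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(z_dim, 512), nn.ReLU(),
                                 nn.Linear(512, x_dim))

    def forward(self, z):
        return self.net(z)


def w2_diag_gaussian(mu_a, var_a, mu_b, var_b):
    """Squared 2-Wasserstein distance between diagonal Gaussians (no entropic term)."""
    mean_term = ((mu_a - mu_b) ** 2).sum(-1)
    cov_term = ((var_a.sqrt() - var_b.sqrt()) ** 2).sum(-1)
    return mean_term + cov_term


def select_pgd(encoder, x_a, eps=0.1, steps=10, step_size=0.02):
    """Select(x_a; L, eps): L_inf PGD that perturbs x_a to maximize the divergence
    between the representations of x_a and x_b (an approximation of Section 3.1)."""
    x_b = x_a.clone().detach()
    mu_a, var_a = encoder(x_a)
    for _ in range(steps):
        x_b.requires_grad_(True)
        mu_b, var_b = encoder(x_b)
        div = w2_diag_gaussian(mu_a.detach(), var_a.detach(), mu_b, var_b).mean()
        grad, = torch.autograd.grad(div, x_b)
        with torch.no_grad():
            x_b = x_b + step_size * grad.sign()          # ascent on the divergence
            x_b = x_a + (x_b - x_a).clamp(-eps, eps)     # project back to the eps-ball
    return x_b.detach()


def se_training_step(encoder, decoder, opt, x_a, v=0.5, gamma=1.0, eps=0.1, L=10):
    """One step minimizing E = E1 (data fidelity) + E2 (prior fidelity)
    + gamma * W2^2 (smoothness penalty between the two representations)."""
    x_b = select_pgd(encoder, x_a, eps=eps, steps=L)
    mu_a, var_a = encoder(x_a)
    mu_b, var_b = encoder(x_b)
    u = torch.randn_like(mu_a)                            # reparametrization trick
    recon = decoder(mu_a + var_a.sqrt() * u)
    e1 = ((x_a - recon) ** 2).sum(-1) / (2.0 * v)         # data fidelity
    e2 = 0.5 * ((mu_a ** 2).sum(-1) + (mu_b ** 2).sum(-1)
                + var_a.sum(-1) + var_b.sum(-1))          # prior fidelity
    wd = w2_diag_gaussian(mu_a, var_a, mu_b, var_b)       # representation smoothness
    loss = (e1 + e2 + gamma * wd).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()


if __name__ == "__main__":
    enc, dec = Encoder(), Decoder()
    opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)
    x = torch.rand(128, 784)  # dummy batch standing in for flattened MNIST images
    print("loss:", se_training_step(enc, dec, opt, x))
```

Note that the PGD selection reuses the same divergence that is later penalized, mirroring the role of Select(xa; L, ϵ) in the pseudocode; the paper's entropy-regularized divergence would replace `w2_diag_gaussian` in both places.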
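For reference, the hyperparameter grid quoted from Table 2 can be written out as a sweep specification. This is a hedged sketch only; the paper does not state how (or whether exhaustively) the grid was searched, and the dictionary keys are illustrative names.

```python
# Hedged enumeration of the Table 2 hyperparameter grid; key names are illustrative.
from itertools import product

grid = {
    "z_dim": [32, 64, 128, 256],          # representation dimension (VAE or SE)
    "obs_var_v": [0.25, 0.5, 1.0, 2.0],   # observation noise variance
    "architecture": ["mlp", "convnet"],
    "gamma": [0.01, 0.1, 1, 5, 10, 50],   # coupling strength (SE only)
    "pgd_eps": [0.01, 0.1, 0.2, 0.3],     # selection PGD radius
    "pgd_steps": [1, 5, 10, 20, 50],      # selection PGD iteration budget L
}

configs = [dict(zip(grid, values)) for values in product(*grid.values())]
print(len(configs), "candidate configurations")
```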