Plug-In Inversion: Model-Agnostic Inversion for Vision with Data Augmentations
Authors: Amin Ghiasi, Hamid Kazemi, Steven Reich, Chen Zhu, Micah Goldblum, Tom Goldstein
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We illustrate the practicality of our approach by inverting Vision Transformers (ViTs) and Multi-Layer Perceptrons (MLPs) trained on the ImageNet dataset, tasks which to the best of our knowledge have not been successfully accomplished by any previous works. To quantitatively evaluate our method, we invert both a pretrained ViT model and a pretrained ResMLP model to produce one image per class using PII, and do the same using DeepDream (i.e., DeepInversion minus feature regularization, which is not available for this model). We then use a variety of pre-trained models to classify these images. Table 2 contains the mean top-1 and top-5 classification accuracies across these models, as well as Inception scores, for the generated images from each method. (A minimal sketch of this evaluation protocol appears after the table.) |
| Researcher Affiliation | Academia | (1) Department of Computer Science, University of Maryland, College Park, USA; (2) New York University Center for Data Science, New York, USA. |
| Pseudocode | Yes | Full pseudocode for the algorithm may be found in appendix E. (Appendix E contains 'Algorithm 1 Optimization procedure for Plug-In Inversion') |
| Open Source Code | Yes | We also make the code used for all demonstrations and experiments in this work available at https://github.com/youranonymousefriend/plugininversion. |
| Open Datasets | Yes | We use a robust ResNet-50 (He et al., 2016) model trained on the ImageNet (Deng et al., 2009) dataset... In Figure 9, we use PII to invert ViT models trained on ImageNet and fine-tuned on CIFAR-100. Figure 10 shows inversion results from models fine-tuned on CIFAR-10. |
| Dataset Splits | No | The paper mentions using standard datasets like ImageNet, CIFAR-100, and CIFAR-10, but it does not explicitly provide the specific percentages or sample counts for training, validation, and test splits used in their experiments. While these datasets have common splits, the paper does not specify the exact split used for its own experimental setup. |
| Hardware Specification | No | The paper mentions 'available GPU memory' when discussing ensemble size, but it does not specify any particular GPU models, CPU types, or other hardware components used to run the experiments. |
| Software Dependencies | No | The paper mentions 'torchvision (Paszke et al., 2019)' and 'PyTorch (Paszke et al., 2019) notation' but does not provide specific version numbers for these libraries or any other software dependencies. |
| Experiment Setup | Yes | In order to tune hyper-parameters of PII for use on naturally-trained models, we use the torchvision (Paszke et al., 2019) ImageNet-trained ResNet-50 model. We apply centering + zoom simultaneously in 7 stages. During each stage, we optimize the selected patch for 400 iterations, applying random jitter and ColorShift at each step. We use the Adam (Kingma & Ba, 2014) optimizer with momentum β = (0.5, 0.99), initial learning rate lr = 0.01, and cosine decay. At the beginning of every stage, the learning rate and optimizer are re-initialized. We use α = β = 1.0 for the ColorShift parameters, and an ensemble size of e = 32. For the PII experiments, we optimize images starting from size 8, increasing the image size by 4 after every 400 iterations. The learning rate is initially set to 0.01 and decayed with a cosine annealing schedule. We use the Adam optimizer with hyper-parameters β = (0.5, 0.99) and ϵ = 10⁻⁸. The learning rate and momentum parameters are reset to their original values every time the image size is increased. |
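
The setup described in the last row maps to a fairly short optimization loop. Below is a minimal PyTorch sketch of that loop under the stated hyper-parameters (7 stages, 400 iterations per stage, Adam with β = (0.5, 0.99) and ϵ = 10⁻⁸, cosine-annealed lr = 0.01 reset at every stage, ensemble of 32, canvas grown from size 8 by 4 per stage). The helper names `color_shift`, `random_jitter`, and `plug_in_inversion` are illustrative placeholders, not the authors' implementation; for the real code see the repository linked in the Open Source Code row.

```python
import torch
import torch.nn.functional as F

def color_shift(x, alpha=1.0, beta=1.0):
    """Hypothetical ColorShift: per-image random channel scaling and shifting,
    with alpha/beta controlling the scale and shift magnitudes."""
    scale = torch.exp(alpha * torch.randn(x.size(0), 3, 1, 1, device=x.device))
    shift = beta * torch.randn(x.size(0), 3, 1, 1, device=x.device)
    return scale * x + shift

def random_jitter(x, max_shift=4):
    """Random spatial jitter implemented as a circular shift."""
    dx, dy = torch.randint(-max_shift, max_shift + 1, (2,)).tolist()
    return torch.roll(x, shifts=(dx, dy), dims=(2, 3))

def plug_in_inversion(model, target_class, stages=7, iters_per_stage=400,
                      start_size=8, size_step=4, ensemble=32, lr=0.01,
                      input_size=224):
    """Sketch of the staged PII optimization described in the setup row above."""
    device = next(model.parameters()).device
    model.eval()
    size = start_size
    # Start from a small random canvas that is progressively enlarged.
    image = torch.randn(1, 3, size, size, device=device, requires_grad=True)
    target = torch.full((ensemble,), target_class, device=device, dtype=torch.long)

    for stage in range(stages):
        # Optimizer and cosine-annealed learning rate are re-initialized per stage.
        opt = torch.optim.Adam([image], lr=lr, betas=(0.5, 0.99), eps=1e-8)
        sched = torch.optim.lr_scheduler.CosineAnnealingLR(opt, T_max=iters_per_stage)
        for _ in range(iters_per_stage):
            # Upsample the small canvas to the network's input resolution,
            # then build an ensemble of differently augmented copies.
            up = F.interpolate(image, size=input_size, mode='bilinear',
                               align_corners=False)
            batch = color_shift(random_jitter(up).repeat(ensemble, 1, 1, 1))
            loss = F.cross_entropy(model(batch), target)
            opt.zero_grad()
            loss.backward()
            opt.step()
            sched.step()
        # Grow the optimization canvas before the next stage
        # (the centering + zoom schedule from the setup row).
        size += size_step
        image = F.interpolate(image.detach(), size=size, mode='bilinear',
                              align_corners=False).requires_grad_(True)
    return image.detach()
```

For example, `plug_in_inversion(torchvision.models.resnet50(weights='IMAGENET1K_V1').cuda(), target_class=207)` would, under these assumptions, produce one inverted image for ImageNet class 207, GPU memory permitting.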
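
The quantitative evaluation quoted in the Research Type row (classifying the generated images with a variety of pretrained models and reporting mean top-1/top-5 accuracy) can be sketched similarly; the classifier list and image tensor below are placeholders, and Inception score computation is omitted.

```python
import torch

@torch.no_grad()
def mean_topk_accuracy(images, labels, classifiers, k=(1, 5)):
    """Classify inverted images with several pretrained models and average
    top-1 / top-5 accuracy across them (the protocol behind Table 2)."""
    results = {f"top{j}": [] for j in k}
    for clf in classifiers:
        clf.eval()
        logits = clf(images)
        for j in k:
            topj = logits.topk(j, dim=1).indices
            correct = (topj == labels.unsqueeze(1)).any(dim=1).float().mean().item()
            results[f"top{j}"].append(correct)
    return {name: sum(v) / len(v) for name, v in results.items()}

# Example (placeholder classifiers and data):
# import torchvision.models as models
# classifiers = [models.resnet50(weights='IMAGENET1K_V1'),
#                models.densenet121(weights='IMAGENET1K_V1')]
# scores = mean_topk_accuracy(inverted_images, class_labels, classifiers)
```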