Improved StyleGAN-v2 based Inversion for Out-of-Distribution Images
Authors: Rakshith Subramanyam, Vivek Narayanaswamy, Mark Naufel, Andreas Spanias, Jayaraman J. Thiagarajan
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our empirical studies on a suite of OOD data show that, in addition to producing higher quality reconstructions over the state-of-the-art inversion techniques, SPHInX is effective for ill-posed restoration tasks while offering semantic editing capabilities. Extensive empirical studies on in-the-wild face and non-face image data to demonstrate the efficacy of SPHInX in reconstruction, semantic editing and solving challenging inverse problems: denoising, compressed recovery and simultaneous inversion & attribute discovery (Appendix B); (e) Systematic study of the behavior of different existing latent space optimization strategies using a broad suite of image datasets |
| Researcher Affiliation | Collaboration | ¹Arizona State University; ²Lawrence Livermore National Laboratories, Livermore, CA, USA. |
| Pseudocode | Yes | A. Algorithm Listing for SPHInX: Algorithm 1 provides the details of training SPHInX. Algorithm 1 SPHInX |
| Open Source Code | Yes | Our codes are publicly accessible². ²https://github.com/Rakshith-2905/SPHInX |
| Open Datasets | Yes | FFHQ faces; Animal Faces-HQ (AFHQ) (Choi et al., 2020): This dataset contains 16,130 high-quality images of various breeds of cats, dogs and other wildlife; Diabetic Retinopathy Images (Retina) (ret): This dataset consists of 88,702 high-resolution, left and right eye retina images taken under a variety of imaging conditions; ISIC 2018 Skin Lesions (Codella et al., 2019): This dataset contains a total of 10,015 dermoscopic lesion images drawn from the HAM10000 database (Tschandl et al., 2018); MIMIC-CXR (Johnson et al., 2019): This is a large public database containing 377,100 chest radiographs (X-rays) corresponding to a variety of radiographic studies |
| Dataset Splits | No | The paper describes the datasets used and how images were processed (resized, rescaled), but does not explicitly provide details about training, validation, and test splits for these datasets. It mentions training for a certain number of iterations but not how the data was partitioned into these sets. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU models, CPU types, or memory specifications. |
| Software Dependencies | No | The paper mentions using StyleGAN-v2 and the ADAM optimizer, but it does not specify version numbers for any software, libraries, or frameworks used in the experiments. |
| Experiment Setup | Yes | The bottleneck block in $P_s$ of SPHInX is constructed using fully connected (FC) layers of sizes [512, 256, 16] and each of the decoders is another block of FCs of sizes [32, 64, 512]. The architecture for $P_c$ is a fully convolutional network comprised of three Conv2D layers with 32, 128 and 512 filters respectively. For each image, we trained all the compared methods for 10,000 iterations using the ADAM optimizer with $\beta_1 = 0.9$ and $\beta_2 = 0.999$. We employ a trapezoid-based learning rate schedule with a maximum learning rate of 0.001. We used $N_\ell = 18$ corresponding to an image resolution of 1024 × 1024. For SPHInX and the Z+ baselines, we use a standard normal distribution $\mathcal{N}(0, I) \in \mathbb{R}^{N_\ell \times 512}$ as the prior for sampling $z^+$. All approaches are trained using a combination of the mean-squared error (MSE) and LPIPS losses. |
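
The optimization settings in the Experiment Setup row can be illustrated with a small sketch. It uses only the values quoted above (10,000 iterations, a maximum learning rate of 0.001, and an 18 × 512 standard normal prior for z+). The paper excerpt does not give the ramp-up and ramp-down lengths of the trapezoid schedule, so `ramp_up_frac` and `ramp_down_frac` are assumptions. The function names are hypothetical and are not taken from the authors' repository.

```python
import random

NUM_LAYERS = 18      # N_l quoted in the setup (1024 x 1024 resolution)
LATENT_DIM = 512     # per-layer latent size quoted in the setup
TOTAL_STEPS = 10_000 # iterations per image
MAX_LR = 1e-3        # maximum learning rate


def trapezoid_lr(step, total_steps=TOTAL_STEPS, max_lr=MAX_LR,
                 ramp_up_frac=0.05, ramp_down_frac=0.25):
    """Linear ramp up, plateau at max_lr, then linear ramp down.

    ramp_up_frac and ramp_down_frac are assumed values; the source
    only states that the schedule is trapezoid-based.
    """
    up_steps = int(total_steps * ramp_up_frac)
    down_steps = int(total_steps * ramp_down_frac)
    if step < up_steps:
        return max_lr * (step + 1) / up_steps
    if step >= total_steps - down_steps:
        return max_lr * (total_steps - step) / down_steps
    return max_lr


def sample_z_plus(num_layers=NUM_LAYERS, latent_dim=LATENT_DIM, seed=None):
    """Draw z+ from N(0, I) as a num_layers x latent_dim nested list."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
            for _ in range(num_layers)]


if __name__ == "__main__":
    z_plus = sample_z_plus(seed=0)
    print(len(z_plus), len(z_plus[0]))
    for step in (0, 499, 5000, 7500, 9999):
        print(step, trapezoid_lr(step))
```

In an actual run, the value returned by `trapezoid_lr` would be assigned to the optimizer's learning rate at each iteration, and the sampled z+ would serve as the starting latent for the inversion.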