Swapping Autoencoder for Deep Image Manipulation

Authors: Taesung Park, Jun-Yan Zhu, Oliver Wang, Jingwan Lu, Eli Shechtman, Alexei Efros, Richard Zhang

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on multiple datasets show that our model produces better results and is substantially more efficient compared to recent generative models.
Researcher Affiliation | Collaboration | ¹UC Berkeley, ²Adobe Research
Pseudocode | No | The paper provides architectural diagrams and mathematical formulations but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper mentions a 'project webpage' for a demo video and interactive UI, but does not explicitly state that source code for the methodology is released or provide a link to a code repository.
Open Datasets | Yes | For existing datasets, our model is trained on LSUN Churches, Bedrooms [80], Animal Faces HQ (AFHQ) [12], and Flickr Faces HQ (FFHQ) [43], all at a resolution of 256px except FFHQ at 1024px. In addition, we introduce new datasets: Portrait2FFHQ, a combined dataset of 17k portrait paintings from wikiart.org and FFHQ at 256px; Flickr Mountain, 0.5M mountain images from flickr.com; and Waterfall, 90k 256px waterfall images.
Dataset Splits | No | The paper mentions training on various datasets (LSUN Churches, Bedrooms, AFHQ, FFHQ, Portrait2FFHQ, Flickr Mountain, Waterfall) but does not specify the training, validation, and test splits (e.g., percentages, sample counts, or explicit references to standard splits).
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., exact GPU/CPU models, memory, or cloud computing specifications) used to run its experiments.
Software Dependencies | No | The paper does not provide specific details about ancillary software dependencies, such as programming languages, libraries, or frameworks with their version numbers (e.g., 'PyTorch 1.9', 'CUDA 11.1').
Experiment Setup | Yes | Our final objective function for the encoder and generator is L_total = L_rec + 0.5 L_GAN,rec + 0.5 L_GAN,swap + L_CooccurGAN. The discriminator objective and design follows StyleGAN2 [44]. The encoder consists of 4 downsampling ResNet [22] blocks to produce the tensor z_s, and a dense layer after average pooling to produce the vector z_t. Please see Appendix ?? for a detailed specification of the architecture, as well as details of the discriminator loss function.
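
Since no official code is released (see the Open Source Code row above), the setup can only be sketched. Below is a minimal, unofficial PyTorch sketch of the pieces the paper describes: an encoder of 4 downsampling ResNet blocks producing the structure tensor z_s, average pooling plus a dense layer producing the texture vector z_t, and the stated loss combination. Module names, channel widths, and the z_t dimension are assumptions; the paper defers exact specifications to its appendix.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DownResBlock(nn.Module):
    """Residual block that halves spatial resolution.
    Hypothetical layout; exact specs are in the paper's appendix."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, out_ch, 3, padding=1)
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, padding=1)
        self.skip = nn.Conv2d(in_ch, out_ch, 1)

    def forward(self, x):
        h = F.leaky_relu(self.conv1(x), 0.2)
        h = F.leaky_relu(self.conv2(h), 0.2)
        h = F.avg_pool2d(h, 2)                      # downsample main path
        x = F.avg_pool2d(self.skip(x), 2)           # downsample skip path
        return x + h

class Encoder(nn.Module):
    """Four downsampling ResNet blocks -> structure tensor z_s;
    average pooling + dense layer -> texture vector z_t.
    `base` and `zt_dim` are assumed values, not from the paper."""
    def __init__(self, base=64, zt_dim=2048):
        super().__init__()
        chans = [3, base, base * 2, base * 4, base * 8]
        self.blocks = nn.Sequential(
            *[DownResBlock(chans[i], chans[i + 1]) for i in range(4)]
        )
        self.to_zt = nn.Linear(chans[-1], zt_dim)

    def forward(self, x):
        z_s = self.blocks(x)                        # structure code (tensor)
        z_t = self.to_zt(z_s.mean(dim=(2, 3)))     # texture code (vector)
        return z_s, z_t

def total_loss(l_rec, l_gan_rec, l_gan_swap, l_cooccur):
    # L_total = L_rec + 0.5 L_GAN,rec + 0.5 L_GAN,swap + L_CooccurGAN
    return l_rec + 0.5 * l_gan_rec + 0.5 * l_gan_swap + l_cooccur

if __name__ == "__main__":
    enc = Encoder()
    x = torch.randn(2, 3, 256, 256)
    z_s, z_t = enc(x)   # z_s: (2, 512, 16, 16), z_t: (2, 2048)
```

Note that the paper states only the block count and the two-code factorization; the discriminator and co-occurrence discriminator losses (following StyleGAN2 [44]) are passed in here as precomputed scalars rather than implemented.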