SHMT: Self-supervised Hierarchical Makeup Transfer via Latent Diffusion Models
Authors: Zhaoyang Sun, Shengwu Xiong, Yaxiong Chen, Fei Du, Weihua Chen, Fan Wang, Yi Rong
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive quantitative and qualitative analyses demonstrate the effectiveness of our method. [...] Following [21, 17], we randomly select 90% of the images from the MT dataset [21] as training samples and the rest as test samples. In addition, the Wild-MT [17] and LADN [11] datasets are also used to validate the performance and generalization capability of our model. The images in the Wild-MT dataset contain large pose and expression variations, and the LADN dataset collects a number of images with complex makeup styles. |
| Researcher Affiliation | Collaboration | Zhaoyang Sun (1,3); Shengwu Xiong (1,2,5); Yaxiong Chen (1); Fei Du (3,4); Weihua Chen (3,4); Fan Wang (3,4); Yi Rong (1,2). Affiliations: (1) School of Computer Science and Artificial Intelligence, Wuhan University of Technology; (2) Sanya Science and Education Innovation Park, Wuhan University of Technology; (3) DAMO Academy, Alibaba Group; (4) Hupan Laboratory; (5) Shanghai AI Laboratory |
| Pseudocode | No | The paper does not contain explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at https://github.com/Snowfallingplum/SHMT. |
| Open Datasets | Yes | The MT [21], Wild-MT [17] and LADN [11] datasets used in our experiments have already been released and can be found at the following links: MT dataset: https://github.com/wtjiang98/BeautyGAN_pytorch. Wild-MT dataset: https://github.com/wtjiang98/PSGAN. LADN dataset: https://github.com/wangguanzhi/LADN. |
| Dataset Splits | No | Following [21, 17], we randomly select 90% of the images from the MT dataset [21] as training samples and the rest as test samples. |
| Hardware Specification | Yes | We train the model with the Adam optimizer, a learning rate of 1e-6, and a batch size of 16 on a single A100 GPU. |
| Software Dependencies | No | The paper mentions software components like LDM, UNet denoiser, DDIM sampler, and Adam optimizer but does not specify their version numbers. |
| Experiment Setup | Yes | In our experiments, we discover that the autoencoder with a downsampling factor of 4 preserves texture details better than the one with a factor of 8. Therefore, the autoencoder with a downsampling factor of 4 is selected, and the SHMT model is trained at a resolution of 256 x 256. The specific structure of the UNet denoiser ϵθ remains the same as the LDM [31], with the IDA module replacing the original conditional injection module. In Equation 3, τ is set to 100. We train the model with the Adam optimizer, a learning rate of 1e-6, and a batch size of 16 on a single A100 GPU. Our model is trained for 250,000 steps in about 5 days. For sampling, we utilize 50 steps of the DDIM sampler [35]. |
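The hyperparameters quoted in the table can be collected into a single configuration for anyone attempting a reproduction. The sketch below is illustrative only: the dictionary keys and the `ddim_timesteps` helper are hypothetical names, not taken from the SHMT repository, and the timestep schedule is the common evenly-spaced DDIM subsequence assuming the LDM default of 1000 training timesteps.

```python
# Hypothetical reproduction config mirroring the settings reported in Sec. 4.1.
# All names here are illustrative; they do not come from the SHMT codebase.
SHMT_CONFIG = {
    "autoencoder_downsampling_factor": 4,  # f=4 preserved texture better than f=8
    "resolution": 256,                     # trained at 256 x 256
    "tau": 100,                            # threshold tau in Equation 3
    "optimizer": "Adam",
    "learning_rate": 1e-6,
    "batch_size": 16,
    "training_steps": 250_000,             # ~5 days on a single A100
    "ddim_sampling_steps": 50,             # DDIM sampler [35]
}

def ddim_timesteps(num_train_timesteps: int = 1000,
                   num_sampling_steps: int = 50) -> list[int]:
    """Evenly spaced timestep subsequence typically used by DDIM-style
    samplers, returned in descending order (high noise -> low noise)."""
    stride = num_train_timesteps // num_sampling_steps
    return list(range(0, num_train_timesteps, stride))[::-1]

steps = ddim_timesteps(1000, SHMT_CONFIG["ddim_sampling_steps"])
print(len(steps), steps[0], steps[-1])  # 50 980 0
```

With 50 sampling steps over 1000 training timesteps, the sampler visits every 20th timestep, which is the usual trade-off the paper's 50-step DDIM setting implies between sampling speed and fidelity.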