SHMT: Self-supervised Hierarchical Makeup Transfer via Latent Diffusion Models
Authors: Zhaoyang Sun, Shengwu Xiong, Yaxiong Chen, Fei Du, Weihua Chen, Fan Wang, Yi Rong
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive quantitative and qualitative analyses demonstrate the effectiveness of our method. [...] Following [21, 17], we randomly select 90% of the images from the MT dataset [21] as training samples and the rest as test samples. In addition, the Wild-MT [17] and LADN [11] datasets are also used to validate the performance and generalization capability of our model. The images in the Wild-MT dataset contain large pose and expression variations, and the LADN dataset collects a number of images with complex makeup styles. |
| Researcher Affiliation | Collaboration | Zhaoyang Sun (1,3); Shengwu Xiong (1,2,5); Yaxiong Chen (1); Fei Du (3,4); Weihua Chen (3,4); Fan Wang (3,4); Yi Rong (1,2). Affiliations: (1) School of Computer Science and Artificial Intelligence, Wuhan University of Technology; (2) Sanya Science and Education Innovation Park, Wuhan University of Technology; (3) DAMO Academy, Alibaba Group; (4) Hupan Laboratory; (5) Shanghai AI Laboratory |
| Pseudocode | No | The paper does not contain explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at https://github.com/Snowfallingplum/SHMT. |
| Open Datasets | Yes | The MT [21], Wild-MT [17] and LADN [11] datasets used in our experiments have already been released and can be found at the following links: MT dataset: https://github.com/wtjiang98/BeautyGAN_pytorch. Wild-MT dataset: https://github.com/wtjiang98/PSGAN. LADN dataset: https://github.com/wangguanzhi/LADN. |
| Dataset Splits | No | Following [21, 17], we randomly select 90% of the images from the MT dataset [21] as training samples and the rest as test samples. |
| Hardware Specification | Yes | We train the model with the Adam optimizer, a learning rate of 1e-6, and a batch size of 16 on a single A100 GPU. |
| Software Dependencies | No | The paper mentions software components like LDM, UNet denoiser, DDIM sampler, and Adam optimizer but does not specify their version numbers. |
| Experiment Setup | Yes | In our experiments, we discover that the autoencoder with a downsampling factor of 4 preserves texture details better than the one with a factor of 8. Therefore, the autoencoder with a downsampling factor of 4 is selected, and the SHMT model is trained at a resolution of 256 x 256. The specific structure of the UNet denoiser ϵθ remains the same as the LDM [31], with the IDA module replacing the original conditional injection module. In Equation 3, τ is set to 100. We train the model with the Adam optimizer, a learning rate of 1e-6, and a batch size of 16 on a single A100 GPU. Our model is trained for 250,000 steps in about 5 days. For sampling, we utilize 50 steps of the DDIM sampler [35]. |
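The hyperparameters quoted in the table can be collected into a single configuration for anyone attempting a reproduction. The sketch below is illustrative only: the dictionary keys and the `ddim_timesteps` helper are hypothetical names, not taken from the SHMT repository, and the timestep schedule is the common evenly-spaced DDIM subsequence assuming the LDM default of 1000 training timesteps.

```python
# Hypothetical reproduction config mirroring the settings reported in Sec. 4.1.
# All names here are illustrative; they do not come from the SHMT codebase.
SHMT_CONFIG = {
    "autoencoder_downsampling_factor": 4,  # f=4 preserved texture better than f=8
    "resolution": 256,                     # trained at 256 x 256
    "tau": 100,                            # threshold tau in Equation 3
    "optimizer": "Adam",
    "learning_rate": 1e-6,
    "batch_size": 16,
    "training_steps": 250_000,             # ~5 days on a single A100
    "ddim_sampling_steps": 50,             # DDIM sampler [35]
}

def ddim_timesteps(num_train_timesteps: int = 1000,
                   num_sampling_steps: int = 50) -> list[int]:
    """Evenly spaced timestep subsequence typically used by DDIM-style
    samplers, returned in descending order (high noise -> low noise)."""
    stride = num_train_timesteps // num_sampling_steps
    return list(range(0, num_train_timesteps, stride))[::-1]

steps = ddim_timesteps(1000, SHMT_CONFIG["ddim_sampling_steps"])
print(len(steps), steps[0], steps[-1])  # 50 980 0
```

With 50 sampling steps over 1000 training timesteps, the sampler visits every 20th timestep, which is the usual trade-off the paper's 50-step DDIM setting implies between sampling speed and fidelity.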