Learning to Manipulate Artistic Images

Authors: Wei Guo, Yuqi Zhang, De Ma, Qian Zheng

AAAI 2024

Reproducibility Variable Result LLM Response
Research Type Experimental Both qualitative and quantitative experiments demonstrate the superiority of our method over state-of-the-art methods. Code is available at https://github.com/SnailForce/SIM-Net. The paper also contains the sections Experiments, Implementation details, Training dataset, Testing dataset, Evaluation Metrics, Overall Performance Comparison with state-of-the-art, and Ablation Study.
Researcher Affiliation Academia Wei Guo*, Yuqi Zhang*, De Ma, Qian Zheng Zhejiang University {snailforce, yq zhang, made, qianzheng}@zju.edu.cn
Pseudocode No The paper describes the proposed method and its modules in detail and provides figures illustrating the architecture, but it does not include any formal pseudocode blocks or algorithms labeled as such.
Open Source Code Yes Code is available at https://github.com/SnailForce/SIM-Net.
Open Datasets Yes Our training data is composed of 1,000 face images from CelebAMask-HQ (Lee et al. 2020), 227 horse images from Weizmann Horse Database (Borenstein, Sharon, and Ullman 2004), and 696/573 mountain/building images from Intel Image Classification.
Dataset Splits No The paper mentions 'Training dataset' and 'Testing dataset' but does not provide specific numerical splits (e.g., percentages or counts) for training, validation, and test sets. It also does not explicitly mention a 'validation' set or how it was used in relation to data splits.
Hardware Specification Yes The experiments are conducted using 4 TITAN Xp GPUs.
Software Dependencies No The paper mentions using the Adam solver but does not specify any software versions for programming languages, libraries, or frameworks (e.g., Python, PyTorch, TensorFlow, CUDA).
Experiment Setup Yes Implementation details. The learning rate for the framework is 2e-4, and we use the Adam solver with β1 = 0.5 and β2 = 0.999 for the optimization. K is set to be 10. Unless otherwise specified, the resolution of generated images for translation tasks is 256 × 256 for fair comparison. The overall objective function of the proposed framework is: $L_{total} = \lambda_1 L_{eq} + \lambda_2 L_{perc} + \lambda_3 L_{context} + \lambda_4 L_{bound} + \lambda_5 (L_{mask}^{I} + L_{mask}^{S}) + \lambda_6 L_{rec} + \lambda_7 L_{cyc}$, where $\lambda_1, \dots, \lambda_7$ are weighting parameters.
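
The reported setup can be sketched in a few lines of framework-free Python. This is only an illustration, not the authors' code: the names `TrainConfig`, `LOSS_TERMS`, and `total_loss` are hypothetical, and the lambda weights and per-term loss values are placeholders, since this summary does not report them. The hyperparameters are the ones quoted in the Experiment Setup row, and the function shows how the two mask terms share a single weight in the weighted sum.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters quoted in the Experiment Setup row."""
    learning_rate: float = 2e-4
    adam_beta1: float = 0.5
    adam_beta2: float = 0.999
    k: int = 10  # "K is set to be 10"; its role is not described in this summary
    resolution: tuple = (256, 256)


# One name per weighted term; the two mask losses share the fifth weight.
LOSS_TERMS = ("eq", "perc", "context", "bound", "mask", "rec", "cyc")


def total_loss(terms, weights):
    """Weighted sum of loss terms, with the I and S mask losses added first."""
    merged = dict(terms)  # copy so the caller's dict is not modified
    merged["mask"] = merged.pop("mask_I") + merged.pop("mask_S")
    missing = [n for n in LOSS_TERMS if n not in merged or n not in weights]
    if missing:
        raise KeyError(f"missing loss terms or weights: {missing}")
    return sum(weights[n] * merged[n] for n in LOSS_TERMS)


# PLACEHOLDER weights: the actual lambda values are not given in this summary.
example_weights = {name: 1.0 for name in LOSS_TERMS}

# PLACEHOLDER per-term loss values, for illustration only.
example_terms = {
    "eq": 0.5, "perc": 0.25, "context": 0.125, "bound": 0.5,
    "mask_I": 0.25, "mask_S": 0.25, "rec": 1.0, "cyc": 0.125,
}

if __name__ == "__main__":
    print(TrainConfig())
    print(total_loss(example_terms, example_weights))
```

In an actual training script the per-term values would be tensors produced by the model, and the configuration would be passed to the Adam optimizer of whichever framework is used; the paper does not name one.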