Painterly Image Harmonization by Learning from Painterly Objects
Authors: Li Niu, Junyan Cao, Yan Hong, Liqing Zhang
AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on the benchmark dataset demonstrate the effectiveness of our proposed method. |
| Researcher Affiliation | Academia | MoE Key Lab of Artificial Intelligence, Shanghai Jiao Tong University {ustcnewly, joyc1, hy2628982280, lqzhang}@sjtu.edu.cn |
| Pseudocode | No | The paper describes its method and network structure in detail but does not provide any explicitly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | No | 3) We will release our annotated reference images/objects, which would greatly benefit the future research of painterly image harmonization. |
| Open Datasets | Yes | Based on 57,025 artistic paintings in the training set of WikiArt (Nichol 2016), we use off-the-shelf object detection model (Wu et al. 2019) pretrained on COCO (Lin et al. 2014) dataset to detect objects in artistic paintings. |
| Dataset Splits | No | The paper mentions using training data from COCO and Wiki Art and refers to '100 test images' for efficiency analysis, but it does not specify the explicit percentages or counts for training, validation, and test splits within the main text. |
| Hardware Specification | Yes | Our network is implemented using PyTorch 1.10.0 and trained using Adam optimizer with learning rate of 1e-4 on Ubuntu 20.04 LTS operating system, with 128GB memory, Intel(R) Xeon(R) Silver 4116 CPU, and one GeForce RTX 3090 GPU. |
| Software Dependencies | Yes | Our network is implemented using PyTorch 1.10.0 and trained using Adam optimizer with learning rate of 1e-4 on Ubuntu 20.04 LTS operating system, with 128GB memory, Intel(R) Xeon(R) Silver 4116 CPU, and one GeForce RTX 3090 GPU. |
| Experiment Setup | Yes | Our network is implemented using PyTorch 1.10.0 and trained using Adam optimizer with learning rate of 1e-4 on Ubuntu 20.04 LTS operating system, with 128GB memory, Intel(R) Xeon(R) Silver 4116 CPU, and one GeForce RTX 3090 GPU. For the encoder and decoder structure, we follow (Cao, Hong, and Niu 2023). For P module, we use one residual block (He et al. 2016). For Ml module in the l-th encoder layer, we stack three ResMLP layers (Touvron et al. 2023), in which the intermediate dimension is equal to the dimension of style vector in the l-th layer. |
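As a concrete reading of the reported experiment setup, the sketch below shows how the stated optimizer configuration (Adam, learning rate 1e-4, in PyTorch) and the described M_l module (three stacked ResMLP-style layers whose intermediate dimension matches the style-vector dimension) might be instantiated. The layer internals, class names, and the `style_dim=512` value are illustrative assumptions, not the authors' released code.

```python
import torch
from torch import nn


class ResMLPLayer(nn.Module):
    """Hypothetical MLP-style residual layer; stands in for the ResMLP layers
    (Touvron et al.) cited in the paper, not a faithful reimplementation."""

    def __init__(self, style_dim):
        super().__init__()
        # Paper states the intermediate dimension equals the style-vector dimension.
        self.norm = nn.LayerNorm(style_dim)
        self.mlp = nn.Sequential(
            nn.Linear(style_dim, style_dim),
            nn.GELU(),
            nn.Linear(style_dim, style_dim),
        )

    def forward(self, x):
        # Residual connection around the MLP sub-block.
        return x + self.mlp(self.norm(x))


class MlModule(nn.Module):
    """Sketch of the M_l module: three stacked ResMLP-style layers."""

    def __init__(self, style_dim):
        super().__init__()
        self.layers = nn.Sequential(*[ResMLPLayer(style_dim) for _ in range(3)])

    def forward(self, style_vector):
        return self.layers(style_vector)


# Optimizer configuration as reported: Adam with learning rate 1e-4.
model = MlModule(style_dim=512)  # style_dim is an assumed placeholder value
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
```

This only mirrors the hyperparameters quoted above; the full network (encoder/decoder following Cao, Hong, and Niu 2023 and the P module's residual block) is not reproduced here.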