FRIH: Fine-Grained Region-Aware Image Harmonization

Authors: Jinlong Peng, Zekun Luo, Liang Liu, Boshen Zhang

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "To demonstrate the effectiveness of our FRIH, we conduct experiments on the public image harmonization dataset iHarmony4 (Cong et al. 2020)."
Researcher Affiliation | Industry | "Jinlong Peng*, Zekun Luo, Liang Liu, Boshen Zhang, Tencent Youtu Lab, {jeromepeng, zekunluo, leoneliu, boshenzhang}@tencent.com"
Pseudocode | No | The paper describes the proposed method, including components such as the Base Network, Submask Extraction, Lightweight Cascaded Module, and Fusion Prediction Module, but it does not include any pseudocode or algorithm blocks.
Open Source Code | No | The paper neither states that source code for the described methodology will be released nor includes a link to a code repository.
Open Datasets | Yes | "To demonstrate the effectiveness of our FRIH, we conduct experiments on the public image harmonization dataset iHarmony4 (Cong et al. 2020)."
Dataset Splits | Yes | "There are totally 65,742 training image pairs and 7,407 test image pairs in iHarmony4. All the image pairs are generated by modifying the specific foreground regions of the normal images, which are converted to corresponding inharmonious images in this way. We follow the same train-test split as DoveNet (Cong et al. 2020) in the experiments."
Hardware Specification | Yes | "We use the Adam optimizer with β1 = 0.9 and β2 = 0.999 to train our model for 180 epochs on 8 Tesla V100 GPUs."
Software Dependencies | No | The paper mentions using the Adam optimizer for training but does not specify any software libraries or dependencies with version numbers (e.g., Python, PyTorch, or TensorFlow versions).
Experiment Setup | Yes | "The initial learning rate is 0.008, which decays by a factor of 10 at epochs 160 and 175. The batch size is 128. All the images are resized to 256 × 256 in both the training and test processes. We use horizontal flip and random-size crop to augment the data during training. The cutoff distance dc is set to 0.1."
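The reported setup (Adam with β1 = 0.9, β2 = 0.999; initial learning rate 0.008 decayed by a factor of 10 at epochs 160 and 175; batch size 128; 256 × 256 inputs; cutoff distance dc = 0.1) can be collected into a small step-decay sketch. This is a minimal illustration of the stated hyperparameters, not the authors' code; all names below are our own, and in PyTorch the schedule would correspond to `torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[160, 175], gamma=0.1)`.

```python
# Hyperparameters as reported in the paper's experiment setup (names are ours).
CONFIG = {
    "base_lr": 0.008,          # initial learning rate
    "milestones": (160, 175),  # epochs at which the LR decays
    "gamma": 0.1,              # decay factor ("decays by a factor of 10")
    "batch_size": 128,
    "epochs": 180,
    "image_size": 256,         # images resized to 256 x 256
    "adam_betas": (0.9, 0.999),
    "cutoff_distance": 0.1,    # d_c used in submask extraction
}

def learning_rate(epoch: int,
                  base_lr: float = CONFIG["base_lr"],
                  milestones=CONFIG["milestones"],
                  gamma: float = CONFIG["gamma"]) -> float:
    """Step-decay schedule: multiply by `gamma` at each milestone passed."""
    lr = base_lr
    for m in milestones:
        if epoch >= m:
            lr *= gamma
    return lr
```

Under this schedule the learning rate stays at 0.008 for epochs 0–159, drops to about 8e-4 at epoch 160, and to about 8e-5 at epoch 175 for the final epochs.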