Context-Aware Image Inpainting with Learned Semantic Priors

Authors: Wendong Zhang, Junwei Zhu, Ying Tai, Yunbo Wang, Wenqing Chu, Bingbing Ni, Chengjie Wang, Xiaokang Yang

IJCAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | It achieves the state of the art on Places2, CelebA, and Paris Street View datasets.
Researcher Affiliation | Collaboration | Wendong Zhang1, Junwei Zhu2, Ying Tai2, Yunbo Wang1, Wenqing Chu2, Bingbing Ni1, Chengjie Wang2 and Xiaokang Yang1. 1MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University; 2Youtu Lab, Tencent
Pseudocode | No | The paper describes the model architecture and training process in text and diagrams (Figure 2) but does not provide any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Code available at https://github.com/WendongZh/SPL
Open Datasets | Yes | We evaluate our approach on three datasets, including Places2 [Zhou et al., 2017], CelebA [Liu et al., 2015], and Paris Street View [Doersch et al., 2015].
Dataset Splits | Yes | We use the standard training set of Places2 with over 4 million images and evaluate all models on its validation set with 36,500 images. The CelebA human face dataset contains over 160,000 training images and about 19,900 testing images. The Paris Street View dataset contains 15,900 training samples and 100 testing samples. We use the mask set of [Liu et al., 2018] with 12,000 irregular masks that are pre-grouped into six intervals according to the size of the masks (0%-10%, 10%-20%, ..., 50%-60%).
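The interval grouping above can be sketched as a small helper that buckets a binary mask by its hole ratio. The function name `mask_interval` is hypothetical; in practice the mask set of [Liu et al., 2018] is distributed already grouped into these six intervals.

```python
import numpy as np

def mask_interval(mask):
    """Bucket a binary mask (1 = hole pixel) into one of the six size
    intervals described above: 0%-10%, 10%-20%, ..., 50%-60%."""
    ratio = float(mask.mean())  # fraction of masked pixels
    for i in range(6):
        if ratio <= (i + 1) / 10:
            return f"{i * 10}%-{(i + 1) * 10}%"
    raise ValueError(f"mask ratio {ratio:.2f} exceeds 60%")

# Example: a 256x256 mask whose top quarter is masked (ratio 0.25)
mask = np.zeros((256, 256), dtype=np.float32)
mask[:64, :] = 1.0
print(mask_interval(mask))  # -> 20%-30%
```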
Hardware Specification | Yes | The batch size is set to 8 for all datasets and we train our model with two V100 GPUs.
Software Dependencies | No | The paper mentions optimizers (Adam solver) and loss functions (asymmetric loss) but does not provide specific version numbers for programming languages or deep learning frameworks (e.g., Python, PyTorch, TensorFlow).
Experiment Setup | Yes | We apply the Adam solver with β1 = 0.0 and β2 = 0.9 for model optimization. The initial learning rate is 1e-4 for all experiments and decayed to 1e-5 at different epochs for different datasets. For Places2 and CelebA, we decay the learning rate at 30 epochs and further fine-tune the model for another 10 epochs. For Paris Street View, we decay the learning rate at 50 epochs and fine-tune the model for 20 epochs. The batch size is set to 8 for all datasets.
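A minimal sketch of this optimization setup, assuming PyTorch (the paper does not name its framework) and using a placeholder model in place of the paper's inpainting network:

```python
import torch

# Placeholder model; the paper's actual inpainting network is not shown here.
model = torch.nn.Conv2d(3, 3, kernel_size=3, padding=1)

# Adam solver with beta1 = 0.0, beta2 = 0.9 and initial learning rate 1e-4.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, betas=(0.0, 0.9))

# Decay 1e-4 -> 1e-5 at epoch 30 (Places2/CelebA schedule; Paris Street View
# would instead use milestones=[50] with 20 fine-tuning epochs).
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[30], gamma=0.1
)

for epoch in range(40):  # 30 epochs at 1e-4, then 10 fine-tuning epochs at 1e-5
    # ... one training epoch with batch size 8 would run here ...
    scheduler.step()
```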