MUSICAL: Multi-Scale Image Contextual Attention Learning for Inpainting
Authors: Ning Wang, Jingyuan Li, Lefei Zhang, Bo Du
IJCAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on the Paris Street View, Places, and CelebA datasets indicate the superior performance of our approach compared to the state of the art. |
| Researcher Affiliation | Academia | Ning Wang, Jingyuan Li, Lefei Zhang and Bo Du, School of Computer Science, Wuhan University. {wangning, jingyuanli, zhanglefei, remoteking}@whu.edu.cn |
| Pseudocode | No | The paper describes its proposed algorithm in text and mathematical equations but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statement about releasing source code or a link to a code repository for the methodology described. |
| Open Datasets | Yes | In this section, we conduct experiments to investigate the effectiveness of our MUSICAL algorithm on three public image datasets including the Paris Street View [Doersch et al., 2012], Places [Zhou et al., 2018], and CelebA [Liu et al., 2015]. |
| Dataset Splits | Yes | The Paris Street View contains 14,900 training images and 100 test images. The Places dataset is the canyon scene selected from the Places365-Standard dataset; this category has 5,000 training images, 900 test images and 100 validation images. In our experiment, we use the training set for training and the validation set for testing. The CelebA dataset contains 162,770 training images, 19,867 validation images and 19,962 test images. We use both the training set and the validation set for training, and use the test set for testing. |
| Hardware Specification | Yes | All the experiments are conducted with Python on Ubuntu 17.10 system, with an i7-6800K 3.40GHz CPU and a 12GB NVIDIA Titan Xp GPU. |
| Software Dependencies | No | The paper mentions 'Python on Ubuntu 17.10 system' but does not specify the version of Python or any other key software dependencies with version numbers (e.g., specific deep learning frameworks or libraries). |
| Experiment Setup | Yes | For both Paris Street View and Places, we resize each training image so that its minimal length/width is 350, and randomly crop a subimage of size 256×256 as input to our model. As for CelebA, we resize each training image so that its minimal length/width is 256, and crop a 256×256 subimage at the center as input to our model. The mask is of size 128×128 for each image and is located in the center of the image. We train the model with a batch size of 5 for each epoch. For all the datasets, the tradeoff parameters are set as λsty = 250, λperc = 0.07, λtv = 0.001 and λl1 = 100. The λadv differs across datasets: for CelebA we set λadv = 0.0, while for the other two datasets we set λadv = 0.3. |
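The preprocessing pipeline quoted above (resize the minimal side to 350, take a random 256×256 crop, place a 128×128 mask at the center) can be sketched in plain Python. This is a minimal sketch of the stated setup, not the authors' code; the helper name and the loss-weight dictionary are hypothetical:

```python
import random

# Tradeoff parameters as reported for Paris Street View / Places;
# the paper sets the adversarial weight to 0.0 for CelebA instead of 0.3.
LOSS_WEIGHTS = {"sty": 250, "perc": 0.07, "tv": 0.001, "l1": 100, "adv": 0.3}

def preprocess_boxes(width, height, min_side=350, crop=256, mask=128):
    """Compute resize dimensions, a random crop box, and the centered
    mask box for one training image (hypothetical helper)."""
    # Resize so the smaller side equals min_side, preserving aspect ratio.
    scale = min_side / min(width, height)
    new_w, new_h = round(width * scale), round(height * scale)
    # Random crop x crop subimage inside the resized image.
    x = random.randint(0, new_w - crop)
    y = random.randint(0, new_h - crop)
    crop_box = (x, y, x + crop, y + crop)
    # mask x mask hole centered inside the crop.
    off = (crop - mask) // 2
    mask_box = (off, off, off + mask, off + mask)
    return (new_w, new_h), crop_box, mask_box
```

For CelebA the paper instead resizes the minimal side to 256 and takes a deterministic center crop, which corresponds to `min_side=256` with the crop offsets fixed at the image center rather than drawn at random.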