The Missing Data Encoder: Cross-Channel Image Completion with Hide-and-Seek Adversarial Network
Authors: Arnaud Dapogny, Matthieu Cord, Patrick Pérez (pp. 10688-10695)
AAAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate our models qualitatively and quantitatively on three datasets, showing their interest for image completion, representation learning, and face occlusion handling. MNIST contains 55k train and 10k test images. The Oxford-102 flowers dataset consists of 8,187 images covering 102 flower classes; we train our models on 7,167 images from the train and test partitions and apply them to the 1,020 validation images. CelebA (Liu et al. 2015) is a large-scale database containing 202k 218×178 celebrity images from 10k identities, each annotated with 40 binary attributes (such as gender, eyeglasses, smile) and 5 landmarks. Qualitative evaluation: Figure 5 shows images generated with MDE on the three datasets. Quantitative evaluation: several metrics assess the quality of the generated images. The peak signal-to-noise ratio (pSNR) quantifies the pixel-wise resemblance between the generated and ground-truth images, the structural similarity (SSIM) index assesses the holistic visual quality of the image, and the inception score (Salimans et al. 2016) evaluates both the semantic relevance of the generated images and their diversity (a hedged pSNR/SSIM sketch follows the table). Ablation study: Figure 7 shows pSNR and inception score for multiple train and test scenarios. |
| Researcher Affiliation | Collaboration | Arnaud Dapogny,1,2 Matthieu Cord,2,3 Patrick Pérez3 1Datakalab, 114 Boulevard Malesherbes, 75017 Paris 2LIP6, Sorbonne Université, 4 place Jussieu, 75005 Paris 3Valeo.ai, 43 Rue Bayen, 75017 Paris arnaud.dapogny@datakalab.com |
| Pseudocode | No | The paper does not contain any explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | For additional details and codes, please visit the project page at gitlab.com/adapo/themissingdataencoder. |
| Open Datasets | Yes | MNIST contains 55k train and 10k test images. The Oxford-102 flowers dataset consists of 8,187 images covering 102 flower classes. CelebA (Liu et al. 2015) is a large-scale database containing 202k 218×178 celebrity images from 10k identities, each annotated with 40 binary attributes (such as gender, eyeglasses, smile) and 5 landmarks. |
| Dataset Splits | Yes | MNIST contains 55k train and 10k test images. We train our models on 7167 images from the train and test partitions, and apply them on the 1020 validation images. As in (Zhong, Sullivan, and Li 2016), we use the train partition that contains 162k images from 8k identities to train our models. The test partition contains 20k instances from 1k identities that are different from the training set identities. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory used for running the experiments. |
| Software Dependencies | No | The paper mentions "We use ADAM optimizer" but does not specify version numbers for any software components, libraries, or programming languages used. |
| Experiment Setup | Yes | We use the ADAM optimizer with a learning rate of 2×10^-4 for the generator and 2×10^-5 for the discriminator. We train with a momentum of 0.5 and polynomial learning rate annealing. Finally, we apply 300,000 updates with batch size 24 to train the network. We set λ_rec^vgg = 2×10^-5, λ_adv = 10^-2, and λ_HnS = 10e-2 for L_HnS^coord and λ_HnS = 10 otherwise (a minimal optimizer sketch follows the table). |
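
The pSNR and SSIM metrics quoted in the Research Type row are standard and can be reproduced independently of the authors' repository. The sketch below is a minimal example, assuming 8-bit RGB images as NumPy arrays and using scikit-image for SSIM; it is not the paper's evaluation code, and the inception score (Salimans et al. 2016) is omitted because it additionally requires a pretrained Inception network.

```python
# Minimal sketch of the pixel-wise metrics reported in the paper (pSNR, SSIM).
# Assumes 8-bit images as NumPy uint8 arrays of identical shape; not the authors' code.
import numpy as np
from skimage.metrics import structural_similarity  # older scikit-image uses multichannel=True


def psnr(ground_truth: np.ndarray, generated: np.ndarray, max_val: float = 255.0) -> float:
    """Peak signal-to-noise ratio between a generated image and its ground truth."""
    mse = np.mean((ground_truth.astype(np.float64) - generated.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)


def ssim(ground_truth: np.ndarray, generated: np.ndarray) -> float:
    """Structural similarity index; channel_axis=-1 assumes H x W x C colour images."""
    return structural_similarity(ground_truth, generated, channel_axis=-1, data_range=255)


if __name__ == "__main__":
    # Example usage on a dummy pair of 218x178 RGB images (CelebA resolution).
    gt = np.random.randint(0, 256, (218, 178, 3), dtype=np.uint8)
    perturbed = np.clip(gt.astype(np.int16) + np.random.randint(-10, 10, gt.shape), 0, 255).astype(np.uint8)
    print(f"pSNR: {psnr(gt, perturbed):.2f} dB, SSIM: {ssim(gt, perturbed):.3f}")
```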
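
The training setup in the last row can be sketched in PyTorch as follows. This is a hypothetical reconstruction, not the authors' released configuration: it assumes that "momentum of 0.5" refers to Adam's beta1, that the polynomial annealing decays the learning rate linearly to zero over the 300,000 updates, and that the loss-weight exponents are those reconstructed above; the generator and discriminator modules are placeholders.

```python
# Sketch of the reported training setup (not the authors' code): Adam with
# lr 2e-4 (generator) / 2e-5 (discriminator), beta1 = 0.5, polynomial learning
# rate annealing, 300,000 updates with batch size 24, and the quoted loss weights.
import torch

TOTAL_UPDATES = 300_000
BATCH_SIZE = 24

# Loss weights as quoted in the paper (exponents reconstructed from the extracted text).
LAMBDA_VGG_REC = 2e-5
LAMBDA_ADV = 1e-2
LAMBDA_HNS_COORD = 10e-2   # weight used for the coordinate Hide-and-Seek loss
LAMBDA_HNS = 10.0          # weight used otherwise

generator = torch.nn.Sequential(torch.nn.Conv2d(3, 64, 3, padding=1))      # placeholder module
discriminator = torch.nn.Sequential(torch.nn.Conv2d(3, 64, 3, padding=1))  # placeholder module

# "Momentum of 0.5" is read here as Adam's beta1; beta2 is left at its default.
opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-5, betas=(0.5, 0.999))

# Polynomial learning rate annealing; a power of 1 (linear decay to zero) is assumed.
def poly_decay(step: int, power: float = 1.0) -> float:
    return (1.0 - step / TOTAL_UPDATES) ** power

sched_g = torch.optim.lr_scheduler.LambdaLR(opt_g, lr_lambda=poly_decay)
sched_d = torch.optim.lr_scheduler.LambdaLR(opt_d, lr_lambda=poly_decay)

for step in range(TOTAL_UPDATES):
    # ... forward a batch of BATCH_SIZE images, backpropagate the weighted
    # reconstruction, adversarial and Hide-and-Seek losses, then update:
    opt_d.step()
    opt_g.step()
    sched_d.step()
    sched_g.step()
```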