Unsupervised Meta-Learning of Figure-Ground Segmentation via Imitating Visual Effects
Authors: Ding-Jie Chen, Jui-Ting Chien, Hwann-Tzong Chen, Tyng-Luh Liu
AAAI 2019, pp. 8159–8166 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate our approach via extensive experiments on six datasets to demonstrate that the proposed model can be end-to-end trained without ground-truth pixel labeling yet outperforms the existing methods of unsupervised segmentation tasks. [Experiments] We first describe the evaluation metric, the testing datasets, the training data, and the algorithms in comparison. Then, we show the comparison results of the relevant algorithms and our approach. Finally, we present the image segmentation and editing results of our approach. [Quantitative Evaluation] The first part of experiment aims to evaluate the segmentation quality of different methods. |
| Researcher Affiliation | Academia | Ding-Jie Chen, Jui-Ting Chien, Hwann-Tzong Chen, Tyng-Luh Liu; Institute of Information Science, Academia Sinica, Taiwan; Department of Computer Science, National Tsing Hua University, Taiwan; {djchen.tw, ydnaandy123}@gmail.com, htchen@cs.nthu.edu.tw, liutyng@iis.sinica.edu.tw |
| Pseudocode | No | The paper describes the system architecture and components (Generator, Discriminator, Editor) in detail but does not provide a formal pseudocode block or algorithm section. |
| Open Source Code | No | The paper does not include an explicit statement or link to publicly available source code for the methodology described. |
| Open Datasets | Yes | Training Data. In training the VEGAN model, we consider using the images from two different sources for comparison. The first image source is MSRA9500 derived from the MSRA10K dataset (Cheng et al. 2015). The second image source is Flickr, and we acquire unorganized images for each task as the training data. ... For Flickr images, we use black background, color selectivo, and defocus/Bokeh as the three query tags, and then collect 4,000 images for each query-tag as the real images with visual effects. |
| Dataset Splits | No | The paper mentions partitioning MSRA10K into 500 (testing) and 9,500 (training) images but does not specify a separate validation dataset split. |
| Hardware Specification | Yes | All algorithms are tested on Intel i7-4770 3.40 GHz CPU, 8GB RAM, and NVIDIA Titan X GPU. |
| Software Dependencies | No | The paper refers to various frameworks and models (e.g., DCGAN, WGAN-GP, ResNet, VGG16, CycleGAN) but does not provide specific version numbers for software components like programming languages, libraries, or frameworks (e.g., Python 3.x, PyTorch 1.x, TensorFlow 2.x). |
| Experiment Setup | Yes | We set the learning rate, λ_gp, and other hyper-parameters the same as the configuration of WGAN-GP (Gulrajani et al. 2017). We keep the history of previously generated images and update the discriminator according to the history. We use the same way as (Zhu et al. 2017) to store 50 previously generated images {I_edit} in a buffer. The training images are of size 224 × 224, and the batch size is 1. From the results just described, the final VEGAN model is implemented with the following setting: i) Generator uses the 9-residual-blocks version of (Johnson, Alahi, and Fei-Fei 2016). ii) Discriminator uses the full-image discriminator as WGAN-GP (Gulrajani et al. 2017). |
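
The "history of previously generated images" in the setup above is the image-buffer trick of Zhu et al. (2017): the discriminator is periodically shown images drawn from a pool of 50 earlier generator outputs instead of only the latest one. A minimal sketch in plain Python is given below; the class name `ImagePool` and the 50%-swap rule are assumptions borrowed from common CycleGAN-style implementations, not details stated in the paper.

```python
import random

class ImagePool:
    """Buffer of previously generated images (image-pool trick, Zhu et al. 2017).

    The discriminator is updated with a mix of freshly generated images and
    images drawn from this history buffer. Capacity 50 matches the setting
    quoted in the paper.
    """

    def __init__(self, capacity=50):
        self.capacity = capacity
        self.images = []

    def query(self, image):
        """Return an image to feed the discriminator.

        While the buffer is not full, the new image is stored and returned.
        Afterwards, with probability 0.5 a stored image is returned and
        replaced by the new one; otherwise the new image is returned as-is.
        """
        if len(self.images) < self.capacity:
            self.images.append(image)
            return image
        if random.random() < 0.5:
            idx = random.randrange(self.capacity)
            old = self.images[idx]
            self.images[idx] = image
            return old
        return image
```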
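The final generator is described only as the 9-residual-block version of Johnson, Alahi, and Fei-Fei (2016). The sketch below, assuming PyTorch, shows one common realization of that layout (a reflection-padded stem, two stride-2 downsampling stages, nine residual blocks, two upsampling stages); the channel widths, instance normalization, and 3-channel output are assumptions for illustration, not values reported in the paper.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """One residual block in the style of Johnson, Alahi, and Fei-Fei (2016)."""
    def __init__(self, channels=256):
        super().__init__()
        self.block = nn.Sequential(
            nn.ReflectionPad2d(1),
            nn.Conv2d(channels, channels, kernel_size=3),
            nn.InstanceNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.ReflectionPad2d(1),
            nn.Conv2d(channels, channels, kernel_size=3),
            nn.InstanceNorm2d(channels),
        )

    def forward(self, x):
        return x + self.block(x)


class Generator(nn.Module):
    """Encoder, nine residual blocks, decoder: one common reading of the
    9-residual-block generator named in the quoted final setting."""
    def __init__(self, in_ch=3, out_ch=3, base=64, n_blocks=9):
        super().__init__()
        layers = [
            nn.ReflectionPad2d(3),
            nn.Conv2d(in_ch, base, kernel_size=7),
            nn.InstanceNorm2d(base),
            nn.ReLU(inplace=True),
            # two stride-2 downsampling convolutions: 224 -> 112 -> 56
            nn.Conv2d(base, base * 2, kernel_size=3, stride=2, padding=1),
            nn.InstanceNorm2d(base * 2),
            nn.ReLU(inplace=True),
            nn.Conv2d(base * 2, base * 4, kernel_size=3, stride=2, padding=1),
            nn.InstanceNorm2d(base * 4),
            nn.ReLU(inplace=True),
        ]
        layers += [ResidualBlock(base * 4) for _ in range(n_blocks)]
        layers += [
            # two stride-2 upsampling transposed convolutions: 56 -> 112 -> 224
            nn.ConvTranspose2d(base * 4, base * 2, 3, stride=2, padding=1, output_padding=1),
            nn.InstanceNorm2d(base * 2),
            nn.ReLU(inplace=True),
            nn.ConvTranspose2d(base * 2, base, 3, stride=2, padding=1, output_padding=1),
            nn.InstanceNorm2d(base),
            nn.ReLU(inplace=True),
            nn.ReflectionPad2d(3),
            nn.Conv2d(base, out_ch, kernel_size=7),
            nn.Tanh(),
        ]
        self.model = nn.Sequential(*layers)

    def forward(self, x):
        return self.model(x)


# Sanity check with the quoted training shape: batch size 1, 224 x 224 RGB input.
g = Generator()
y = g(torch.randn(1, 3, 224, 224))
assert y.shape == (1, 3, 224, 224)
```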