MODNet: Real-Time Trimap-Free Portrait Matting via Objective Decomposition
Authors: Zhanghan Ke, Jiayu Sun, Kaican Li, Qiong Yan, Rynson W.H. Lau
AAAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that MODNet outperforms prior trimap-free methods by a large margin on both Adobe Matting Dataset and a carefully designed photographic portrait matting (PPM-100) benchmark proposed by us. |
| Researcher Affiliation | Collaboration | Zhanghan Ke (1,2), Jiayu Sun (1), Kaican Li (2), Qiong Yan (2), Rynson W.H. Lau (1); (1) Department of Computer Science, City University of Hong Kong; (2) SenseTime Research |
| Pseudocode | No | The paper describes the architecture and methodology of MODNet, including its branches and novel techniques, but does not provide any structured pseudocode or algorithm blocks. (A hedged sketch of the three-branch decomposition appears after this table.) |
| Open Source Code | Yes | Our code and models: https://github.com/ZHKKKe/MODNet |
| Open Datasets | Yes | We train all models on the same dataset, which contains nearly 3,000 annotated foregrounds. Background replacement (Xu et al. 2017) is applied to extend our training set. All images in our training set are collected from Flickr and are annotated by Photoshop. The training set contains 2,600 half-body and 400 full-body portraits. For each labeled foreground, we generate 5 samples by random cropping and 10 samples by compositing with the images from the Open Images dataset (Kuznetsova et al. 2018) (as the background). We use MobileNetV2 pre-trained on the Supervisely Person Segmentation (SPS) dataset as the backbone of all trimap-free models. (A compositing sketch for this augmentation appears after this table.) |
| Dataset Splits | No | The paper mentions validating models on the Adobe Matting Dataset and the PPM-100 benchmark, but does not provide explicit details about the split percentages or sample counts for training, validation, and test sets to allow for exact reproduction of data partitioning. |
| Hardware Specification | Yes | MODNet is much faster than contemporaneous methods and runs at 67 frames per second on a 1080Ti GPU. (A timing sketch for checking throughput appears after this table.) |
| Software Dependencies | No | The paper mentions using optimizers like SGD and Adam, but does not provide specific version numbers for software libraries or frameworks (e.g., PyTorch, TensorFlow, CUDA) that were used to implement and run the experiments. |
| Experiment Setup | Yes | For MODNet, we train it by SGD for 40 epochs. With a batch size of 16, the initial learning rate is set to 0.01 and is multiplied by 0.1 after every 10 epochs. We set λs = λα = 1 and λd = 10. (A training-loop sketch with this configuration appears after this table.) |
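
Since the paper provides no pseudocode, the following is a minimal sketch of the objective decomposition it describes: a semantic branch for coarse foreground, a detail branch for boundaries, and a fusion branch that produces the final matte. All module and variable names here are assumptions for illustration; the sketch does not reproduce the official repository's architecture (e.g., the real encoder is MobileNetV2 with e-ASPP).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MODNetSketch(nn.Module):
    """Minimal sketch of MODNet's three-branch objective decomposition.

    Each branch is reduced to a couple of convolutions so the data flow
    between branches stays readable; this is not the paper's full network.
    """

    def __init__(self):
        super().__init__()
        # Semantic branch: low-resolution coarse foreground prediction.
        self.semantic = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1),
        )
        # Detail branch: boundary detail prediction, guided by semantics.
        self.detail = nn.Sequential(
            nn.Conv2d(3 + 1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1),
        )
        # Fusion branch: blends coarse semantics with fine details.
        self.fusion = nn.Sequential(
            nn.Conv2d(2, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1),
        )

    def forward(self, img):
        s_coarse = torch.sigmoid(self.semantic(img))  # coarse mask
        s_up = F.interpolate(s_coarse, size=img.shape[2:],
                             mode='bilinear', align_corners=False)
        d_fine = torch.sigmoid(self.detail(torch.cat([img, s_up], dim=1)))
        alpha = torch.sigmoid(self.fusion(torch.cat([s_up, d_fine], dim=1)))
        return s_coarse, d_fine, alpha
```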
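
The background-replacement augmentation quoted above (compositing each labeled foreground over Open Images backgrounds) reduces to standard alpha compositing with the ground-truth matte. A minimal sketch, assuming foreground, matte, and background are already loaded as same-sized float arrays; the function name and the random arrays are illustrative only.

```python
import numpy as np

def composite(foreground, alpha, background):
    """Alpha-composite a foreground over a new background.

    foreground: HxWx3 float array in [0, 1]
    alpha:      HxW   float array in [0, 1] (ground-truth matte)
    background: HxWx3 float array in [0, 1], same size as foreground
    """
    a = alpha[..., None]  # broadcast the matte over the channel axis
    return a * foreground + (1.0 - a) * background

# Per the paper: each labeled foreground yields 10 training samples,
# each paired with a different randomly chosen background.
rng = np.random.default_rng(0)
fg = rng.random((256, 256, 3))
matte = rng.random((256, 256))
samples = [composite(fg, matte, rng.random((256, 256, 3)))
           for _ in range(10)]
```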
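
The 67 fps figure can be checked with a standard synchronized GPU timing loop, as sketched below. The input resolution (512×512), iteration counts, and the use of the MODNetSketch module from above are assumptions; a CUDA device is required.

```python
import time
import torch

def measure_fps(model, size=(1, 3, 512, 512), iters=100, device='cuda'):
    """Rough throughput measurement: warm up, then time N forward passes."""
    model = model.to(device).eval()
    x = torch.randn(size, device=device)
    with torch.no_grad():
        for _ in range(10):  # warm-up iterations before timing
            model(x)
        torch.cuda.synchronize()
        start = time.perf_counter()
        for _ in range(iters):
            model(x)
        torch.cuda.synchronize()  # wait for all kernels before stopping
    return iters / (time.perf_counter() - start)
```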
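
The reported setup (SGD, 40 epochs, batch size 16, initial lr 0.01 decayed ×0.1 every 10 epochs, λs = λα = 1, λd = 10) maps directly onto a PyTorch optimizer plus a StepLR scheduler. A minimal sketch using the MODNetSketch module from above; the L1 terms and the dummy one-batch data stand in for the paper's actual semantic, detail, and matte losses and its composited training set.

```python
import torch
import torch.nn.functional as F

model = MODNetSketch()  # from the architecture sketch above (assumption)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
# Multiply the learning rate by 0.1 after every 10 epochs, as in the paper.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)

# Loss weights from the paper: lambda_s = lambda_alpha = 1, lambda_d = 10.
lam_s, lam_alpha, lam_d = 1.0, 1.0, 10.0

# Dummy data so the sketch runs end to end; a real run would iterate over
# batches of 16 composited portraits with ground-truth mattes.
img = torch.rand(2, 3, 256, 256)
gt_matte = torch.rand(2, 1, 256, 256)

for epoch in range(40):
    s_coarse, d_fine, alpha = model(img)
    gt_coarse = F.interpolate(gt_matte, size=s_coarse.shape[2:],
                              mode='bilinear', align_corners=False)
    # Stand-in L1 losses for the paper's semantic, detail, and matte terms.
    loss = (lam_s * F.l1_loss(s_coarse, gt_coarse)
            + lam_d * F.l1_loss(d_fine, gt_matte)
            + lam_alpha * F.l1_loss(alpha, gt_matte))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()  # lr becomes 1e-3, 1e-4, 1e-5 at epochs 10, 20, 30
```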