Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
MaskFactory: Towards High-quality Synthetic Data Generation for Dichotomous Image Segmentation
Authors: Haotian Qian, Yinda Chen, Shengtao Lou, Fahad Shahbaz Khan, Xiaogang Jin, Deng-Ping Fan
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments on the widely-used DIS5K dataset benchmark demonstrate superior performance in quality and efficiency compared to existing methods. |
| Researcher Affiliation | Academia | (1) State Key Lab of CAD&CG, Zhejiang University; (2) VCIP&CS, Nankai University; (3) MBZUAI; (4) Linköping University |
| Pseudocode | Yes | Appendix A: Pseudocode for the MaskFactory Algorithm. Algorithm 1: MaskFactory Algorithm |
| Open Source Code | Yes | The code is available at https://qian-hao-tian.github.io/MaskFactory/. |
| Open Datasets | Yes | We conduct our experiments on the DIS5K dataset, which comprises 5,479 high-resolution images... The DIS5K dataset is divided into three subsets: DIS-TR (3,000 images) for training, DIS-VD (470 images) for validation, and DIS-TE (2,000 images) for testing. |
| Dataset Splits | Yes | The DIS5K dataset is divided into three subsets: DIS-TR (3,000 images) for training, DIS-VD (470 images) for validation, and DIS-TE (2,000 images) for testing. |
| Hardware Specification | Yes | Our image editing framework is implemented using PyTorch and trained on 8 NVIDIA GeForce RTX 3090 GPUs. We train the segmentation model using the DIS-TR subset of the DIS5K dataset, utilizing 2 NVIDIA GeForce RTX 3090 GPUs. |
| Software Dependencies | No | Only |
| Experiment Setup | Yes | The hyperparameters used in our model are as follows: a batch size of 16, an image size of 512x512, 5 editing iterations, a learning rate of 0.001, a weight decay of 0.0001, 1000 diffusion steps, and a diffusion step size of 0.1. The hyperparameters in Equ 3 are set to λ1 = 0.8 and λ2 = 0.5. ... The input size to the network is 512x512, with a learning rate of 0.0001 and a batch size of 48. The model is optimized using the Adam optimizer over a total of 800 epochs. |
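
The Experiment Setup and Hardware Specification rows describe two separate configurations: one for the image editing framework and one for the segmentation model. As a rough sketch, the quoted values could be collected into two configuration objects like the ones below. The class and field names (`EditingConfig`, `SegmentationConfig`, `lambda_1`, and so on) are hypothetical and chosen only for illustration; they are not taken from the authors' code, and only the numeric values come from the table.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class EditingConfig:
    """Image editing framework settings, as quoted in the table.

    Field names are illustrative assumptions, not the authors' identifiers.
    """
    batch_size: int = 16
    image_size: Tuple[int, int] = (512, 512)
    editing_iterations: int = 5
    learning_rate: float = 1e-3
    weight_decay: float = 1e-4
    diffusion_steps: int = 1000
    diffusion_step_size: float = 0.1
    lambda_1: float = 0.8  # weight reported for Eq. 3 in the paper
    lambda_2: float = 0.5  # weight reported for Eq. 3 in the paper
    num_gpus: int = 8      # NVIDIA GeForce RTX 3090


@dataclass(frozen=True)
class SegmentationConfig:
    """Segmentation model training settings, as quoted in the table.

    Field names are illustrative assumptions, not the authors' identifiers.
    """
    input_size: Tuple[int, int] = (512, 512)
    learning_rate: float = 1e-4
    batch_size: int = 48
    optimizer: str = "Adam"
    epochs: int = 800
    num_gpus: int = 2      # NVIDIA GeForce RTX 3090


if __name__ == "__main__":
    # Print both configurations for a quick side-by-side check.
    print(EditingConfig())
    print(SegmentationConfig())
```

Keeping the two stages in separate objects makes it harder to confuse their learning rates (0.001 for editing versus 0.0001 for segmentation) and batch sizes (16 versus 48), which differ between the stages.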