Unsupervised Foreground Extraction via Deep Region Competition

Authors: Peiyu Yu, Sirui Xie, Xiaojian Ma, Yixin Zhu, Ying Nian Wu, Song-Chun Zhu

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments demonstrate that DRC exhibits more competitive performance on complex real-world data and challenging multi-object scenes compared with prior methods.
Researcher Affiliation | Collaboration | UCLA Department of Computer Science; UCLA Department of Statistics; Beijing Institute for General Artificial Intelligence (BIGAI)
Pseudocode | Yes | Algorithm 1: Learning models of DRC via EM.
Open Source Code | Yes | Code and data available at https://github.com/yuPeiyu98/DRC
Open Datasets | Yes | (1) Caltech-UCSD Birds-200-2011 (Birds) [76], Stanford Dogs (Dogs) [77], and Stanford Cars (Cars) [78] datasets; (2) CLEVR6 [79] and Textured Multi-dSprites (TM-dSprites) [80] datasets.
Dataset Splits | No | 'To evaluate our method, we split the Birds dataset following Chen et al. [28], resulting in 10K training images and 1K testing images. On the Dogs and Cars datasets, we split based on the original train-test splits [77, 78]. This gives 3,286 dog images and 6,218 car images for training, and 1,738 dog images and 6,104 car images for testing, respectively.'
Hardware Specification | No | The paper's checklist states: 'Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [Yes] See supplemental material.' Hardware details are thus deferred to the supplementary material and are not described in the main paper.
Software Dependencies | No | The paper mentions using PyTorch and techniques such as batch-norm, instance-norm, orthogonal initialization, and the Total-Variation norm, but it does not specify version numbers for any software dependency (e.g., 'PyTorch 1.9' or 'CUDA 11.1').
Experiment Setup | No | The paper describes some architectural and training choices, such as 'five stacked upsample-conv-norm layers', a '3-layered MLP', 'orthogonal initialization', and the 'Total-Variation norm', but it does not provide specific hyperparameter values (e.g., learning rate, batch size, number of epochs) in the main text, stating that 'More details... are provided in the supplementary material.'