Object-aware Contrastive Learning for Debiased Scene Representation

Authors: Sangwoo Mo, Hyunwoo Kang, Kihyuk Sohn, Chun-Liang Li, Jinwoo Shin

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments demonstrate the effectiveness of our representation learning framework, particularly when trained under multi-object images or evaluated under the background (and distribution) shifted images. From Section 3 (Experiments): We first verify the localization performance of ContraCAM in Section 3.1. We then demonstrate the efficacy of our debiased contrastive learning: object-aware random crop improves the training under multi-object images by reducing contextual bias in Section 3.2, and background mixup improves generalization on background and distribution shifts by reducing background bias in Section 3.3.
Researcher Affiliation | Collaboration | Sangwoo Mo¹, Hyunwoo Kang¹, Kihyuk Sohn², Chun-Liang Li², Jinwoo Shin¹ (¹KAIST, ²Google Cloud AI); {swmo,hyunwookang,jinwoos}@kaist.ac.kr, {kihyuks,chunliang}@google.com
Pseudocode | Yes | We provide the pseudo-code of the entire Iterative ContraCAM procedure in Appendix A.
Open Source Code | Yes | Code is available at https://github.com/alinlab/object-aware-contrastive.
Open Datasets | Yes | We train the models for 800 epochs on COCO [25] and ImageNet-9 [23], and 2,000 epochs on CUB [46] and Flowers [26] datasets with batch size 256.
Dataset Splits | No | The paper mentions training and testing on datasets such as COCO, Flowers, CUB, ImageNet-9, CIFAR-10, CIFAR-100, Food, and Pets, and evaluating via linear evaluation. However, explicit training/validation/test splits (percentages, sample counts, or references to predefined splits for all datasets used) are not provided in the main text. For example, for linear evaluation it mentions training a linear classifier "on top of the learned representation" using the ORIGINAL dataset for the Background Challenge, but gives no specific splits such as 80/10/10.
Hardware Specification | Yes | The training of the baseline models on the COCO (∼100,000 samples) dataset takes 1.5 days on 4 GPUs and 3 days on 8 GPUs for the ResNet-18 and ResNet-50 architectures, respectively, using a single machine with 8 GeForce RTX 2080 Ti GPUs; training time is proportional to the number of samples and training epochs for other cases.
Software Dependencies | No | We apply the conditional random field (CRF) using the default hyperparameters from the pydensecrf library [49] to produce segmentation masks, and use the opencv [50] library to extract bounding boxes.
Experiment Setup | Yes | We train the models for 800 epochs on COCO [25] and ImageNet-9 [23], and 2,000 epochs on CUB [46] and Flowers [26] datasets with batch size 256. We follow the default hyperparameters of MoCo v2 and BYOL, except for a smaller minimum random crop scale of 0.08 (instead of the original 0.2), since it performed better, especially for the multi-object images.
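The Software Dependencies row above describes a mask-to-box step: segmentation masks from the CRF are converted into bounding boxes with OpenCV. As a minimal sketch of what that conversion does, the NumPy function below extracts a tight box from a binary mask; it is a simplified stand-in for OpenCV's contour utilities (the paper's pipeline, per the quote, uses opencv), and the single-object assumption and the `(x, y, w, h)` convention are assumptions for illustration.

```python
import numpy as np

def mask_to_bbox(mask: np.ndarray):
    """Return the tight bounding box (x, y, w, h) of a binary mask.

    Simplified stand-in for an OpenCV contour-based extraction;
    assumes a single foreground object (hypothetical helper).
    """
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None  # empty mask: no box to extract
    x0, x1 = xs.min(), xs.max()
    y0, y1 = ys.min(), ys.max()
    return int(x0), int(y0), int(x1 - x0 + 1), int(y1 - y0 + 1)

# Toy mask with a 3x4 foreground patch at rows 2..4, columns 1..4.
mask = np.zeros((8, 8), dtype=np.uint8)
mask[2:5, 1:5] = 1
print(mask_to_bbox(mask))  # (1, 2, 4, 3)
```

For masks with several disconnected objects, OpenCV's contour detection would yield one box per object, which this single-box sketch does not attempt.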
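The Experiment Setup row mentions lowering the minimum random crop scale to 0.08. In standard contrastive-learning augmentations this scale bounds the fraction of image area a crop may cover. The sketch below mimics that area-based sampling (modeled on the common rejection-sampling approach, as in torchvision's RandomResizedCrop); the function name and fallback behavior are assumptions for illustration, not the authors' code.

```python
import math
import random

def sample_crop(height, width, scale=(0.08, 1.0), ratio=(3 / 4, 4 / 3), rng=random):
    """Sample a crop (top, left, h, w): target area uniform in
    scale * image_area, aspect ratio log-uniform in `ratio`.
    Hypothetical helper modeled on RandomResizedCrop-style sampling."""
    area = height * width
    for _ in range(10):  # rejection sampling: retry if the crop doesn't fit
        target_area = area * rng.uniform(*scale)
        aspect = math.exp(rng.uniform(math.log(ratio[0]), math.log(ratio[1])))
        w = int(round(math.sqrt(target_area * aspect)))
        h = int(round(math.sqrt(target_area / aspect)))
        if 0 < w <= width and 0 < h <= height:
            top = rng.randint(0, height - h)
            left = rng.randint(0, width - w)
            return top, left, h, w
    return 0, 0, height, width  # fallback: keep the whole image

# With scale=(0.08, 1.0), a crop may cover as little as ~8% of the image,
# versus 20% under the original minimum scale of 0.2.
top, left, h, w = sample_crop(224, 224, scale=(0.08, 1.0))
print(h * w / (224 * 224))  # crop-area fraction, roughly within [0.08, 1.0]
```

A smaller minimum scale produces more aggressive zoom-in crops, which is why it interacts with multi-object images: small crops are more likely to isolate a single object.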