Improving Self-supervised Learning with Automated Unsupervised Outlier Arbitration
Authors: Yu Wang, Jingyang Lin, Jingjing Zou, Yingwei Pan, Ting Yao, Tao Mei
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically show UOTA's advantage over the state-of-the-art self-supervised paradigms with an evident margin, which well justifies the existence of the OOD sample issue embedded in the existing approaches. In this section, we empirically evaluate the effectiveness of the proposed UOTA algorithm and justify the correctness of Theorem 1. |
| Researcher Affiliation | Collaboration | Yu Wang (1), Jingyang Lin (2), Jingjing Zou (3), Yingwei Pan (1), Ting Yao (1), Tao Mei (1); (1) JD AI Research, Beijing, China; (2) Sun Yat-sen University, Guangzhou, China; (3) University of California, San Diego, USA |
| Pseudocode | Yes | According to the discussion above, the complete UOTA training procedure of w_{i,j} is described as follows (see pseudocode in the supplementary material). |
| Open Source Code | Yes | Code is available: https://github.com/ssl-codelab/uota. |
| Open Datasets | Yes | All models pretrained for 200 epochs on ImageNet100 [39]. More downstream task accuracy (i.e., object detection, instance segmentation and keypoint detection) on MS COCO dataset [29] |
| Dataset Splits | Yes | In this section, we choose ImageNet100 training data as training set and ImageNet1K val set (50,000 images) as test set. |
| Hardware Specification | Yes | We train all relevant algorithms on 4 V100 GPUs, with a batch size of 256. The total training time for SwAV+UOTA is 181.5 hours, only a negligible 2 hours more than SwAV's training time of 179.2 hours on 4 V100 GPUs. |
| Software Dependencies | No | The paper mentions optimizers (e.g., LARS, SGD) and network architectures (ResNet-18, ResNet-50) but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | On ImageNet100, we pretrain all the approaches for 200 epochs with a batch size of 128. We train all relevant algorithms on 4 V100 GPUs, with a batch size of 256. For each X+UOTA model, we first train its baseline X for N_warm warm-up epochs, and then we resume the X+UOTA training till the end... The total number of training epochs of X+UOTA, including warm-up, is the same as that for training X. |
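
The two-stage schedule quoted in the Experiment Setup row (warm up the baseline X, then resume training with UOTA for the remaining epochs) can be sketched as below. This is a minimal illustration only, not the authors' implementation: the helper names `baseline_step` and `uota_step` and the value of `warmup_epochs` are assumptions; the actual code is in the official repository at https://github.com/ssl-codelab/uota.

```python
# Minimal sketch of the X + UOTA training schedule described in the table above.
# Assumptions (not from the paper): the helper names `baseline_step` and
# `uota_step`, and the concrete value of `warmup_epochs` (N_warm).

TOTAL_EPOCHS = 200    # pretraining budget reported in the review
WARMUP_EPOCHS = 100   # hypothetical N_warm; the paper treats this as a hyperparameter

def pretrain(model, loader, baseline_step, uota_step):
    """Warm up with the plain baseline objective, then switch to UOTA.

    The total number of epochs (warm-up + UOTA) matches the budget used to
    train the baseline X alone, as stated in the Experiment Setup row.
    """
    for epoch in range(TOTAL_EPOCHS):
        for batch in loader:
            if epoch < WARMUP_EPOCHS:
                # Stage 1: train the baseline self-supervised model X as-is.
                baseline_step(model, batch)
            else:
                # Stage 2: resume from the warmed-up weights and apply the
                # UOTA per-view weighting when computing the loss.
                uota_step(model, batch)
```

The only design point the sketch encodes is that the warm-up and UOTA phases share a single epoch budget, so X+UOTA consumes no extra training epochs relative to its baseline X.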