UCoL: Unsupervised Learning of Discriminative Facial Representations via Uncertainty-Aware Contrast
Authors: Hao Wang, Min Li, Yangyang Song, Youjian Zhang, Liying Chi
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that UCoL significantly improves the baselines of unsupervised models and performs on par with the semi-supervised and supervised face representation learning methods. |
| Researcher Affiliation | Collaboration | Hao Wang*¹, Min Li*¹, Yangyang Song¹, Youjian Zhang², Liying Chi¹ (¹ByteDance Inc., ²The University of Sydney) |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not include any explicit statement about releasing source code or a link to a code repository. |
| Open Datasets | Yes | MS-Celeb-1M (Guo et al. 2016) dataset is adopted as the training dataset, and the ground-truth labels are eliminated under the unsupervised setting. We adopt the version of MS1M-RetinaFace (Deng et al. 2019b), which consists of 5.1M images from 93K classes. All the face images are preprocessed with alignment and cropping in the same way as ArcFace (Deng et al. 2019a). To demonstrate the effectiveness of the learned facial representation, we conduct experiments on several standard face recognition benchmarks, including LFW (Huang and Learned-Miller 2014), MegaFace (Kemelmacher-Shlizerman et al. 2016) and IJB-C (Nech and Kemelmacher-Shlizerman 2017), to test the face verification accuracy. |
| Dataset Splits | No | The paper identifies MS-Celeb-1M as the training dataset and LFW, Mega Face, and IJB-C as benchmark datasets for testing. However, it does not specify a separate validation split for the training data or details on how validation was performed to tune hyperparameters. |
| Hardware Specification | Yes | Our models are trained for 20 epochs on 8 Tesla V100s, with a batch size of 512, learning rate of 0.004, and weight decay of 5 × 10⁻⁴. |
| Software Dependencies | No | The paper mentions using the AdamW optimizer and Vision Transformers but does not specify version numbers for any software libraries, frameworks, or programming languages. |
| Experiment Setup | Yes | Our training approach starts with 5 epochs of linear warmup and employs a cosine learning rate decay. For improved stability, we utilize the same setting of layerwise learning rate decay as MoCo-v3. Our models are trained for 20 epochs on 8 Tesla V100s, with a batch size of 512, learning rate of 0.004, and weight decay of 5 × 10⁻⁴. During the first 4 epochs, we exclusively use intra-instance contrastive learning (λ = 0), then introduce inter-instance self-labeled pairs by increasing the coefficient λ to 0.5. To perform margin-based InfoNCE loss, we set positive and dictionary queues to sizes 512 and 204,800, respectively, with hyper-parameters τ = 1/80 and m = 0.3. (Illustrative sketches of this schedule and loss appear below the table.) |
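
The training schedule quoted in the Experiment Setup row (5 warmup epochs, cosine decay over 20 epochs, base learning rate 0.004, and λ increased from 0 to 0.5 after epoch 4) can be sketched as follows. This is a minimal illustration built only from the reported numbers; the per-epoch granularity and the hard switch for λ are assumptions, not the authors' implementation.

```python
import math

def learning_rate(epoch, base_lr=0.004, warmup_epochs=5, total_epochs=20):
    """Linear warmup followed by cosine decay, using the quoted values.
    Per-epoch updates are an assumption; the paper may step per iteration."""
    if epoch < warmup_epochs:
        return base_lr * (epoch + 1) / warmup_epochs
    progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))

def inter_instance_lambda(epoch):
    """Coefficient for inter-instance self-labeled pairs: 0 for the first
    4 epochs (intra-instance contrast only), then 0.5. A hard switch is
    assumed; the paper only states that lambda is increased to 0.5."""
    return 0.0 if epoch < 4 else 0.5
```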
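The margin-based InfoNCE loss with τ = 1/80, m = 0.3, and a dictionary queue of 204,800 negatives could look roughly like the sketch below. The tensor shapes and the placement of the margin on the positive logit are assumptions, and the sketch omits the uncertainty-aware weighting implied by the paper's title; it is not the authors' code.

```python
import torch
import torch.nn.functional as F

def margin_infonce(query, positive, neg_queue, tau=1.0 / 80, margin=0.3):
    """Margin-based InfoNCE sketch (hypothetical shapes, not the paper's code).

    query:     (B, D) L2-normalized query embeddings
    positive:  (B, D) L2-normalized positives (e.g. drawn from the size-512 positive queue)
    neg_queue: (K, D) L2-normalized dictionary of negatives (K = 204,800 in the quoted setup)
    """
    # Subtract the margin from the positive cosine similarity before temperature
    # scaling; this is one common margin-based variant, assumed here.
    pos_logit = (query * positive).sum(dim=1, keepdim=True) - margin   # (B, 1)
    neg_logits = query @ neg_queue.t()                                 # (B, K)
    logits = torch.cat([pos_logit, neg_logits], dim=1) / tau
    # Index 0 of each row corresponds to the positive pair.
    targets = torch.zeros(query.size(0), dtype=torch.long, device=query.device)
    return F.cross_entropy(logits, targets)
```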