Learning Weakly-supervised Contrastive Representations
Authors: Yao-Hung Hubert Tsai, Tianqin Li, Weixin Liu, Peiyuan Liao, Ruslan Salakhutdinov, Louis-Philippe Morency
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our empirical experiments suggest the following three contributions. First, compared to conventional self-supervised representations, the auxiliary-information-infused representations bring the performance closer to supervised representations, which use direct downstream labels as supervision signals. Second, our approach performs best in most cases when compared with other baseline representation learning methods that also leverage auxiliary data information. Third, we show that our approach also works well with unsupervisedly constructed clusters (i.e., no auxiliary information), resulting in a strong unsupervised representation learning approach. |
| Researcher Affiliation | Academia | Yao-Hung Hubert Tsai¹, Tianqin Li¹, Weixin Liu¹, Peiyuan Liao¹, Ruslan Salakhutdinov¹, Louis-Philippe Morency¹; ¹Carnegie Mellon University; {yaohungt, tianqinl, weixinli, peiyuanl, rsalakhu, morency}@cs.cmu.edu |
| Pseudocode | Yes | Algorithm 1: K-means Clusters + Cl-InfoNCE. Result: Pretrained Encoder fθ(·). fθ(·) ← Base Encoder Network; Aug(·) ← Obtaining Two Variants of Augmented Data via Augmentation Functions; Embedding ← Gathering data representations by passing data through fθ(·); Clusters ← K-means-clustering(Embedding); for epoch in 1,2,...,N do for batch in 1,2,...,M do data1, data2 ← Aug(data_batch); feature1, feature2 ← fθ(data1), fθ(data2); L_Cl-InfoNCE ← Cl-InfoNCE(feature1, feature2, Clusters); fθ ← fθ − lr·∇θ L_Cl-InfoNCE; end; Embedding ← gather embeddings for all data through fθ(·); Clusters ← K-means-clustering(Embedding); end. A hedged PyTorch sketch of this loop appears after the table. |
| Open Source Code | Yes | Code available at: https://github.com/Crazy-Jack/Cl-InfoNCE |
| Open Datasets | Yes | We conduct experiments on learning visual representations using UT-Zappos50K (Yu & Grauman, 2014), CUB-200-2011 (Wah et al., 2011), Wider Attribute (Li et al., 2016) and ImageNet-100 (Russakovsky et al., 2015) datasets. |
| Dataset Splits | Yes | We randomly split train-validation images by a 7:3 ratio, resulting in 35,017 training images and 15,008 validation images (UT-Zappos50K). The dataset comes with its own training, validation, and test split; due to the small amount of data, we combine the original training and validation sets as our training set and use the original test set as our validation set, giving 6,871 training images and 6,918 validation images (Wider Attribute). We follow the original train-validation split, resulting in 5,994 training images and 5,794 validation images (CUB-200-2011). The training split contains 128,783 images and the test split contains 5,000 images (ImageNet-100). A hedged sketch of the 7:3 split appears after the table. |
| Hardware Specification | Yes | We conduct experiments on machines with 4 NVIDIA Tesla P100 GPUs. |
| Software Dependencies | No | The paper mentions using ResNet-50 as the feature encoder but does not specify software dependencies like Python, PyTorch, or CUDA versions. |
| Experiment Setup | Yes | We choose SGD with a momentum of 0.95 as the optimizer, with a weight decay of 0.0001 to prevent network over-fitting. To allow stable training, we employ a linear warm-up and cosine decay scheduler for the learning rate. For the experiments shown in Figure 4(a) in the main text, the learning rate is set to 0.17 and the temperature in Cl-InfoNCE is 0.07; for the experiments shown in Figure 5 in the main text, the learning rate is set to 0.1 and the temperature in Cl-InfoNCE is 0.1. A hedged optimizer/scheduler sketch appears after the table. |
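
The following is a minimal PyTorch sketch of Algorithm 1 (K-means Clusters + Cl-InfoNCE), reconstructed only from the pseudocode row above. The batch size, the number of clusters `k`, the `augment` callable, and the exact positive-pair weighting inside `cl_infonce` are assumptions for illustration, not details confirmed by the paper.

```python
# Hypothetical sketch of Algorithm 1; hyperparameters, augment(), and the
# positive-pair weighting in cl_infonce are assumptions, not the paper's exact form.
import torch
import torch.nn.functional as F
from sklearn.cluster import KMeans

def cl_infonce(z1, z2, cluster_ids, temperature=0.07):
    """Clustered InfoNCE: cross-view pairs sharing a cluster id count as positives."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature                         # (B, B) similarities
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    pos = (cluster_ids.unsqueeze(1) == cluster_ids.unsqueeze(0)).float()
    # Average the log-likelihood over each anchor's positive set.
    return -(log_prob * pos).sum(1).div(pos.sum(1).clamp(min=1)).mean()

def pretrain(encoder, data, augment, epochs, steps_per_epoch,
             lr=0.17, batch_size=256, k=1000):
    opt = torch.optim.SGD(encoder.parameters(), lr=lr,
                          momentum=0.95, weight_decay=1e-4)
    for _ in range(epochs):
        with torch.no_grad():                                  # re-embed all data
            emb = encoder(data)
        clusters = torch.as_tensor(                            # refresh K-means clusters
            KMeans(n_clusters=k, n_init=10).fit_predict(emb.cpu().numpy()))
        for _ in range(steps_per_epoch):
            idx = torch.randint(len(data), (batch_size,))
            z1 = encoder(augment(data[idx]))                   # two augmented views
            z2 = encoder(augment(data[idx]))
            loss = cl_infonce(z1, z2, clusters[idx])
            opt.zero_grad(); loss.backward(); opt.step()
    return encoder
```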
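
For the UT-Zappos50K split, the report gives only the 7:3 ratio and the resulting counts (35,017 / 15,008). A minimal sketch of such a random split follows; the seed and the use of `torch.utils.data.random_split` are assumptions.

```python
# Hypothetical 7:3 random train/validation split; the seed is an assumption.
import torch
from torch.utils.data import Dataset, random_split

def split_70_30(dataset: Dataset, seed: int = 0):
    n_train = int(0.7 * len(dataset))   # 35,017 of UT-Zappos50K's 50,025 images
    n_val = len(dataset) - n_train      # the remaining 15,008 images
    gen = torch.Generator().manual_seed(seed)
    return random_split(dataset, [n_train, n_val], generator=gen)
```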
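
The experiment-setup row names SGD with momentum 0.95, weight decay 0.0001, and a linear warm-up followed by cosine decay. A sketch under those settings is below; the warm-up length and total epoch count are assumptions, since the paper does not report them.

```python
# Optimizer/scheduler sketch; warmup_epochs and total_epochs are assumed values.
import math
import torch

def make_optimizer(model, lr=0.17, total_epochs=200, warmup_epochs=10):
    opt = torch.optim.SGD(model.parameters(), lr=lr,
                          momentum=0.95, weight_decay=1e-4)

    def lr_lambda(epoch):
        if epoch < warmup_epochs:                        # linear warm-up
            return (epoch + 1) / warmup_epochs
        t = (epoch - warmup_epochs) / max(1, total_epochs - warmup_epochs)
        return 0.5 * (1.0 + math.cos(math.pi * t))       # cosine decay to 0

    sched = torch.optim.lr_scheduler.LambdaLR(opt, lr_lambda)
    return opt, sched  # call sched.step() once per epoch
```

The learning rate and temperature then follow the row above: lr = 0.17 with temperature 0.07 for Figure 4(a), and lr = 0.1 with temperature 0.1 for Figure 5.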