PiCO: Contrastive Label Disambiguation for Partial Label Learning

Authors: Haobo Wang, Ruixuan Xiao, Yixuan Li, Lei Feng, Gang Niu, Gang Chen, Junbo Zhao

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments demonstrate that PiCO significantly outperforms the current state-of-the-art approaches in PLL and even achieves comparable results to fully supervised learning.
Researcher Affiliation | Academia | 1 Zhejiang University, 2 University of Wisconsin-Madison, 3 Chongqing University, 4 RIKEN
Pseudocode | Yes | Appendix C: PSEUDO-CODE OF PICO
Open Source Code | Yes | Code and data available: https://github.com/hbzju/PiCO.
Open Datasets | Yes | First, we evaluate PiCO on two commonly used benchmarks CIFAR-10 and CIFAR-100 (Krizhevsky et al., 2009).
Dataset Splits | Yes | Following the standard experimental setup in PLL (Feng et al., 2020b; Wen et al., 2021), we split a clean validation set (10% of training data) from the training set to select the hyperparameters.
Hardware Specification | Yes | We train the models using one Quadro P5000 GPU respectively and evaluate the average training time per epoch.
Software Dependencies | No | The paper mentions an '18-layer ResNet' backbone and 'SimAugment' and 'RandAugment' for data augmentation, but it does not specify software versions for frameworks (like PyTorch or TensorFlow), libraries, or specific ResNet implementations.
Experiment Setup | Yes | The projection head of the contrastive network is a 2-layer MLP that outputs 128-dimensional embeddings. We use two data augmentation modules SimAugment (Khosla et al., 2020) and RandAugment (Cubuk et al., 2019) for query and key data augmentation respectively. [...] The size of the queue that stores key embeddings is fixed to be 8192. The momentum coefficients are set as 0.999 for contrastive network updating and γ = 0.99 for prototype calculation. For pseudo target updating, we linearly ramp down φ from 0.95 to 0.8. The temperature parameter is set as τ = 0.07. The loss weighting factor is set as λ = 0.5. The model is trained by a standard SGD optimizer with a momentum of 0.9 and the batch size is 256. We train the model for 800 epochs with cosine learning rate scheduling. We also empirically find that classifier warm-up leads to better performance when there are many candidates. Hence, we disable contrastive learning in the first 100 epochs for CIFAR-100 with q = 0.1 and 1 epoch for the remaining experiments.
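
The quoted experiment setup translates fairly directly into a training configuration. Below is a minimal PyTorch-style sketch of those hyperparameters; the names PiCOConfig, build_projection_head, phi_schedule, and build_optimizer, as well as the base learning rate, are illustrative assumptions and are not taken from the paper or the released code.

    # Hypothetical sketch of the quoted training setup; not the authors' implementation.
    from dataclasses import dataclass
    import torch
    import torch.nn as nn

    @dataclass
    class PiCOConfig:
        feat_dim: int = 128           # output dimension of the projection head
        queue_size: int = 8192        # key-embedding queue size
        moco_momentum: float = 0.999  # momentum for the contrastive (key) network
        proto_momentum: float = 0.99  # gamma, momentum for prototype calculation
        phi_start: float = 0.95       # pseudo-target coefficient, ramped down ...
        phi_end: float = 0.80         # ... linearly to this value
        temperature: float = 0.07     # tau in the contrastive loss
        loss_weight: float = 0.5      # lambda, weight of the contrastive term
        epochs: int = 800
        batch_size: int = 256
        lr: float = 0.01              # assumption: base LR not quoted in this excerpt

    def build_projection_head(in_dim: int, cfg: PiCOConfig) -> nn.Module:
        # 2-layer MLP mapping encoder features to 128-d contrastive embeddings.
        return nn.Sequential(
            nn.Linear(in_dim, in_dim),
            nn.ReLU(inplace=True),
            nn.Linear(in_dim, cfg.feat_dim),
        )

    def phi_schedule(epoch: int, cfg: PiCOConfig) -> float:
        # Linear ramp-down of the pseudo-target coefficient phi over training.
        t = epoch / max(cfg.epochs - 1, 1)
        return cfg.phi_start + t * (cfg.phi_end - cfg.phi_start)

    def build_optimizer(model: nn.Module, cfg: PiCOConfig):
        # Standard SGD with momentum 0.9 and cosine learning-rate scheduling.
        opt = torch.optim.SGD(model.parameters(), lr=cfg.lr, momentum=0.9)
        sched = torch.optim.lr_scheduler.CosineAnnealingLR(opt, T_max=cfg.epochs)
        return opt, sched

Under this sketch, phi_schedule starts at 0.95 and reaches 0.8 at the final epoch, matching the linear ramp-down described in the quoted setup.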