Partial Label Learning with Batch Label Correction

Authors: Yan Yan, Yuhong Guo

AAAI 2020, pp. 6575-6582

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments are conducted on synthesized and real-world partial label learning datasets, while the proposed approach demonstrates the state-of-the-art performance for partial label learning.
Researcher Affiliation | Academia | (1) School of Computer Science and Engineering, Northwestern Polytechnical University, China; (2) School of Computer Science, Carleton University, Canada
Pseudocode | Yes | Algorithm 1: Training Algorithm of PL-BLC.
Open Source Code | No | The paper does not provide an explicit statement or a link for open-source code availability.
Open Datasets | Yes | We conducted controlled experiments on synthetic PL datasets constructed from 8 UCI datasets shown in Table 1. ... We also conducted experiments on six real-world PL datasets, which are collected from several application domains, including FG-NET (Panis and Lanitis 2014) for facial age estimation; Lost (Cour, Sapp, and Taskar 2011), Soccer Player (Zeng et al. 2013), and Yahoo! News (Guillaumin, Verbeek, and Schmid 2010) for automatic face naming from images or videos; MSRCv2 (Dietterich and Bakiri 1994) for object classification; and Bird Song (Briggs, Fern, and Raich 2012) for bird song classification.
Dataset Splits | Yes | On each PL dataset, we performed ten-fold cross-validation and report the average test accuracy results. (A sketch of this evaluation protocol follows the table.)
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | No | The paper mentions the 'Adam (Kingma and Ba 2014) optimizer' and the 'LeakyReLU activation function' but does not provide version numbers for the software libraries or dependencies used.
Experiment Setup | Yes | The Adam (Kingma and Ba 2014) optimizer is adopted for training and the mini-batch size m is set to 32. We set α and β in Eq.(5) to 0.5 and 1, respectively. The learning rate, sharpening temperature T, and the number of training iterations in Algorithm 1 are set to 0.0002, 0.4, and 100·n/32, respectively. We selected the hyperparameter η from {0.001, 0.01, 0.1, 0.3, 0.5, 1, 10}. For the EMA decay γ, we used γ = 0.99 during the ramp-up phase (the first 20·n/32 iterations in our experiments) and γ = 0.999 for the rest of training, since the student model improves quickly in the early phase. (A configuration sketch follows the table.)
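
For readers reconstructing the setup, the following is a minimal sketch of the training configuration quoted in the Experiment Setup row, assuming a PyTorch implementation (the paper does not name a framework). The model factory, feature/class dimensions, sharpening formula, and teacher-student EMA placement are illustrative assumptions; only the numeric hyperparameter values come from the paper.

```python
import torch
import torch.nn as nn

# Numeric values below are quoted from the paper; everything else is assumed.
n = 1000                              # number of training examples (dataset dependent)
NUM_FEATURES, NUM_CLASSES = 64, 10    # placeholders; depend on the dataset

BATCH_SIZE = 32                       # mini-batch size m
LEARNING_RATE = 2e-4                  # Adam learning rate
ALPHA, BETA = 0.5, 1.0                # trade-off weights alpha, beta in Eq.(5)
TEMPERATURE = 0.4                     # sharpening temperature T
RAMP_UP_ITERS = 20 * n // 32          # ramp-up phase: first 20*n/32 iterations
TOTAL_ITERS = 100 * n // 32           # total training iterations: 100*n/32

def ema_decay(iteration: int) -> float:
    """EMA decay gamma: 0.99 during ramp-up, 0.999 for the rest of training."""
    return 0.99 if iteration < RAMP_UP_ITERS else 0.999

def sharpen(probs: torch.Tensor, T: float = TEMPERATURE) -> torch.Tensor:
    """Temperature sharpening of predicted label distributions (illustrative form)."""
    p = probs.pow(1.0 / T)
    return p / p.sum(dim=1, keepdim=True)

def build_model() -> nn.Module:
    """Small MLP stand-in; the paper only states that LeakyReLU activations are used."""
    return nn.Sequential(
        nn.Linear(NUM_FEATURES, 128),
        nn.LeakyReLU(),
        nn.Linear(128, NUM_CLASSES),
    )

student, teacher = build_model(), build_model()
teacher.load_state_dict(student.state_dict())
optimizer = torch.optim.Adam(student.parameters(), lr=LEARNING_RATE)

@torch.no_grad()
def update_teacher(gamma: float) -> None:
    """Exponential-moving-average update of teacher weights from student weights."""
    for t_p, s_p in zip(teacher.parameters(), student.parameters()):
        t_p.mul_(gamma).add_(s_p, alpha=1.0 - gamma)
```

How α and β enter the loss in Eq.(5), and how the corrected batch labels are produced, are not reproduced here; the sketch only fixes the reported optimizer, schedule, and decay values.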
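
The evaluation protocol in the Dataset Splits row is standard ten-fold cross-validation with the average test accuracy reported. Below is a minimal sketch, assuming scikit-learn for the fold split; `train_and_eval`, the shuffling, and the random seed are illustrative assumptions, not details stated in the paper.

```python
import numpy as np
from sklearn.model_selection import KFold

def ten_fold_accuracy(X, Y_partial, y_true, train_and_eval):
    """Average test accuracy over ten folds.

    `train_and_eval(X_tr, Y_tr, X_te, y_te)` is a hypothetical callback that
    trains a partial-label model on (X_tr, Y_tr) and returns accuracy on the
    held-out fold, whose ground-truth labels are used only for evaluation.
    """
    kf = KFold(n_splits=10, shuffle=True, random_state=0)
    fold_accuracies = []
    for train_idx, test_idx in kf.split(X):
        acc = train_and_eval(X[train_idx], Y_partial[train_idx],
                             X[test_idx], y_true[test_idx])
        fold_accuracies.append(acc)
    return float(np.mean(fold_accuracies))
```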