reproducibilityindex.ai

Semi-Supervised Learning under Class Distribution Mismatch

Authors: Yanbei Chen, Xiatian Zhu, Wei Li, Shaogang Gong (pp. 3569-3576)

AAAI 2020

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We provide extensive benchmarking results in this realistic SSL scenario, including our proposed UASD and six representative state-of-the-art SSL methods on three image classification datasets: CIFAR10, CIFAR100 and Tiny ImageNet. Remarkably, UASD outperforms all the strong competitors often by large margins, and demonstrates great potential to exploit the unconstrained unlabelled data.
Researcher Affiliation	Collaboration	Yanbei Chen (1), Xiatian Zhu (2), Wei Li (1), Shaogang Gong (1). (1) Queen Mary University of London; (2) Vision Semantics Ltd.
Pseudocode	Yes	Algorithm 1: Uncertainty-Aware Self-Distillation (UASD). Require: labelled data D_l = {x_{i,l}, y_i}_{i=1}^{N_l}; unlabelled data D_u = {x_{i,u}}_{i=1}^{N_u}. Require: trainable neural network θ; ramp-up weighting function w(t). for t = 1 to max_epoch do: refresh confidence threshold τ_t per epoch; for k = 1 to max_iter_per_epoch do: forward propagation to accumulate network prediction q_t(y|x_i) (Eq. (1)) for every in-batch sample; apply OOD filtering (Eq. (2), (3)); update network parameters θ with loss function Eq. (4); end for; end for.
Open Source Code	No	The paper states, 'For a comprehensive and fair comparison, our experiments are built upon the open-source Tensorflow implementation by Oliver et al. (Oliver et al. 2018).' This refers to a base implementation they used, not an explicit statement that their own UASD code is open source or provided.
Open Datasets	Yes	Datasets. We use three image classification benchmark datasets. (1) CIFAR10: A natural image dataset with 50,000/10,000 training/test samples from 10 object classes. (2) CIFAR100: A dataset of 100 fine-grained classes, with the same amount of training/test samples as CIFAR10. (3) Tiny ImageNet: A subset of ImageNet (Deng et al. 2009) with 200 classes, each of which has 500/50 training/validation images.
Dataset Splits	Yes	Thus, we dynamically estimate τ_t in a data-driven manner by using the validation set (10% of training data) of known classes as reference.
Hardware Specification	No	The paper does not provide specific hardware details such as GPU or CPU models, memory specifications, or cloud instance types used for running experiments.
Software Dependencies	No	The paper mentions building upon 'the open-source Tensorflow implementation by Oliver et al. (Oliver et al. 2018)' but does not specify a version number for TensorFlow or any other software dependencies.
Experiment Setup	Yes	Implementation details. For a comprehensive and fair comparison, our experiments are built upon the open-source Tensorflow implementation by Oliver et al. (Oliver et al. 2018). It uses the standard Wide ResNet (Zagoruyko and Komodakis 2016), i.e. WRN-28-2, as the base network and Adam optimiser (Kingma and Ba 2014) for training. We revise the default 10-dimensional classification layer to K-dimension, where K is the number of known classes in the labelled data. Unless stated otherwise, all hyper-parameters, the ramp-up function, and training procedures are the same as that of (Oliver et al. 2018).
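
The per-epoch steps named in the Pseudocode row (accumulate predictions, refresh the threshold τ_t from the validation set, apply OOD filtering) can be illustrated with a minimal Python sketch. Eq. (1)-(4) are not reproduced on this page, so three choices below are assumptions rather than the authors' method: the accumulation is taken to be a running mean of softmax outputs, the confidence score is taken to be the maximum accumulated class probability, and τ_t is taken to be the mean confidence over the validation set. All function names are hypothetical, and the loss update (Eq. (4)) is left out.

```python
from typing import List, Sequence


def accumulate_prediction(running_mean: Sequence[float],
                          new_probs: Sequence[float],
                          t: int) -> List[float]:
    """Fold the t-th softmax output (t >= 1) into a running mean.

    Assumption: "accumulate network prediction" means averaging over epochs.
    """
    if t < 1:
        raise ValueError("t must be >= 1")
    if t == 1:
        return list(new_probs)
    return [(m * (t - 1) + p) / t for m, p in zip(running_mean, new_probs)]


def confidence(probs: Sequence[float]) -> float:
    """Assumed confidence score: the largest accumulated class probability."""
    return max(probs)


def estimate_threshold(val_predictions: Sequence[Sequence[float]]) -> float:
    """Assumed rule for tau_t: mean confidence on the known-class validation set."""
    if not val_predictions:
        raise ValueError("validation set must not be empty")
    scores = [confidence(p) for p in val_predictions]
    return sum(scores) / len(scores)


def ood_filter(unlabelled_predictions: Sequence[Sequence[float]],
               tau: float) -> List[bool]:
    """Keep an unlabelled sample only if its confidence reaches tau."""
    return [confidence(p) >= tau for p in unlabelled_predictions]


if __name__ == "__main__":
    # Toy example with K = 3 known classes and made-up probabilities.
    q = accumulate_prediction([], [0.6, 0.3, 0.1], t=1)
    q = accumulate_prediction(q, [0.8, 0.1, 0.1], t=2)
    tau = estimate_threshold([[0.9, 0.05, 0.05], [0.7, 0.2, 0.1]])
    keep = ood_filter([[0.95, 0.03, 0.02], [0.4, 0.3, 0.3]], tau)
    print(q, tau, keep)
```

In this toy run the second unlabelled sample falls below the threshold and would be excluded from the unsupervised loss term. A real implementation would operate on batched tensors inside the training loop of the Oliver et al. codebase rather than on Python lists.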