Cluster-aware Semi-supervised Learning: Relational Knowledge Distillation Provably Learns Clustering

Authors: Yijun Dong, Kevin Miller, Qi Lei, Rachel Ward

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we present experimental results on CIFAR-10/100 [Krizhevsky and Hinton, 2009] to demonstrate the efficacy of combining DAC and RKD (i.e., the local and global perspectives of clustering) for semi-supervised learning in the low-label-rate regime.
Researcher Affiliation | Academia | Yijun Dong, Courant Institute of Mathematical Sciences, New York University, New York, NY, yd1319@nyu.edu; Kevin Miller, Oden Institute for Computational Engineering & Science, University of Texas at Austin, Austin, TX, ksmiller@utexas.edu; Qi Lei, Courant Institute of Mathematical Sciences & Center of Data Science, New York University, New York, NY, ql518@nyu.edu; Rachel Ward, Oden Institute for Computational Engineering & Science, University of Texas at Austin, Austin, TX, rward@math.utexas.edu
Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks.
Open Source Code | Yes | The experiment code can be found at https://github.com/dyjdongyijun/Semi_Supervised_Knowledge_Distillation.
Open Datasets | Yes | In this section, we present experimental results on CIFAR-10/100 [Krizhevsky and Hinton, 2009] to demonstrate the efficacy of combining DAC and RKD (i.e., the local and global perspectives of clustering) for semi-supervised learning in the low-label-rate regime.
Dataset Splits | No | The paper states that 'The average and standard deviation of the best test accuracy (i.e., early stopping with the maximum patience 128) are reported', which implies some model-selection process, but the reported metric is test accuracy and no separate validation set, split sizes, or tuning protocol is specified for early stopping or hyperparameter selection.
Hardware Specification | Yes | Both CIFAR-10/100 experiments are conducted on one NVIDIA A40 GPU.
Software Dependencies | No | The linked repository references `pytorch_cifar10`, but the paper does not provide version numbers for any software libraries, frameworks (e.g., PyTorch, TensorFlow), or other dependencies.
Experiment Setup | Yes | Throughout the experiments, we used weight decay 0.0005. We train the student model via stochastic gradient descent (SGD) with Nesterov momentum 0.9 for 2^17 iterations (batches) with a batch size 64 × 8 = 2^9 (consisting of 64 labeled samples and 64 × 7 unlabeled samples). The initial learning rate is 0.03, decaying with a cosine scheduler. The test accuracies are evaluated... on an EMA model with a decay rate 0.999.
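
The experiment setup row above maps onto a short PyTorch sketch, shown below. This is a minimal illustration, not the authors' code: the backbone is a placeholder, random tensors stand in for the CIFAR batches, a plain cross-entropy loss replaces the paper's DAC/RKD objectives, and CosineAnnealingLR plus a manual EMA update are one plausible reading of "cosine scheduler" and "EMA model with a decay rate 0.999".

```python
# Hypothetical sketch of the quoted training configuration (assumptions noted inline).
import copy
import torch

total_iters = 2 ** 17                  # 2^17 SGD iterations (batches)
labeled_bs, unlabeled_bs = 64, 64 * 7  # 64 labeled + 448 unlabeled = 2^9 samples per batch

student = torch.nn.Linear(3 * 32 * 32, 10)  # placeholder for the real backbone
ema_model = copy.deepcopy(student)          # EMA copy used for test evaluation
ema_decay = 0.999

optimizer = torch.optim.SGD(
    student.parameters(),
    lr=0.03,              # initial learning rate
    momentum=0.9,
    nesterov=True,
    weight_decay=5e-4,    # weight decay 0.0005
)
# The paper only says "cosine scheduler"; CosineAnnealingLR is one plausible choice.
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=total_iters)

@torch.no_grad()
def ema_update(ema, model, decay=ema_decay):
    """Exponential moving average of the student weights (decay rate 0.999)."""
    for p_ema, p in zip(ema.parameters(), model.parameters()):
        p_ema.mul_(decay).add_(p, alpha=1.0 - decay)

# Illustrative loop: random tensors stand in for labeled CIFAR batches, and
# cross-entropy stands in for the DAC/RKD loss; the unlabeled stream is omitted.
for step in range(3):  # would be range(total_iters) in a real run
    x = torch.randn(labeled_bs, 3 * 32 * 32)
    y = torch.randint(0, 10, (labeled_bs,))
    loss = torch.nn.functional.cross_entropy(student(x), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()
    ema_update(ema_model, student)
```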