Semi-supervised Learning with Multi-Head Co-Training
Authors: Mingcai Chen, Yuntao Du, Yi Zhang, Shuwei Qian, Chongjun Wang
AAAI 2022, pp. 6278–6286
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The effectiveness of Multi-Head Co-Training is demonstrated in an empirical study on standard semi-supervised learning benchmarks. [...] The number and structure of heads in our framework can be arbitrary, but we set the head number as three and the head structure as the last residual block (Zagoruyko and Komodakis 2016) in most of the experiments. [...] We benchmark the proposed method on experimental settings using CIFAR-10 (Krizhevsky and Hinton 2009), CIFAR-100 (Krizhevsky and Hinton 2009), and SVHN (Netzer et al. 2011) as the standard practice. |
| Researcher Affiliation | Academia | State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China {chenmc, duyuntao, njuzhangy, QianSW}@smail.nju.edu.cn, chjwang@nju.edu.cn |
| Pseudocode | Yes | The overall algorithm is shown in Algorithm 1. (referring to 'Algorithm 1: Multi-Head Co-Training with three heads.') |
| Open Source Code | No | The paper does not contain any explicit statement or link indicating that the source code for the methodology is openly available. |
| Open Datasets | Yes | We benchmark the proposed method on experimental settings using CIFAR-10 (Krizhevsky and Hinton 2009), CIFAR-100 (Krizhevsky and Hinton 2009), and SVHN (Netzer et al. 2011) as the standard practice. [...] We further evaluate our model on the more complex dataset Mini-ImageNet (Vinyals et al. 2016), which is a subset of ImageNet (Deng et al. 2009). |
| Dataset Splits | Yes | Different portions of labeled data ranging from 0.5% to 20% are experimented with. [...] We randomly choose 250-4000 from the 50000 training images of CIFAR-10 as labeled examples; the labels of the other images are discarded. [...] The training set of Mini-ImageNet consists of 50000 images with a size of 84 × 84 in 10 object classes. We randomly choose 4000 and 10000 images as labeled examples and discard the label information of the rest. [...] early stopped using the ImageNet validation set. |
| Hardware Specification | No | The paper does not explicitly state specific hardware details such as GPU models, CPU models, or memory used for running the experiments. It only mentions 'growing computing power'. |
| Software Dependencies | No | The paper mentions software components like 'SGD with Nesterov momentum' and uses 'WRN 28-2' or 'WRN 28-8' as backbones, but it does not specify version numbers for any programming languages, libraries, or frameworks (e.g., Python, PyTorch, TensorFlow, CUDA). |
| Experiment Setup | Yes | SGD with Nesterov momentum (Sutskever et al. 2013) is used, along with weight decay and cosine learning rate decay (Loshchilov and Hutter 2016). The details are in the supplementary material. [...] In every training iteration of Multi-Head Co-Training, a batch of labeled examples Dl = {(x_b, y_b); b ∈ (1, ..., B_l)} and a batch of unlabeled examples Du = {(u_b); b ∈ (1, ..., B_u)} are randomly sampled from the labeled and unlabeled dataset respectively. |
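The core of the pseudocode cited above (Algorithm 1, Multi-Head Co-Training with three heads) is a co-training agreement rule: each head is supervised on unlabeled data by the predictions of the *other* heads. The sketch below is a plain-Python illustration of that rule under a simplifying assumption (the other two heads must agree exactly); the function name and the strict-agreement threshold are illustrative, not taken from the paper.

```python
def pseudo_label_for_head(head_idx, predictions):
    """Co-training agreement rule (illustrative sketch).

    predictions: list of predicted class indices, one per head.
    Returns a pseudo-label for head `head_idx` when all *other*
    heads agree on a class, else None (the example is skipped
    for this head in that iteration).
    """
    others = [p for i, p in enumerate(predictions) if i != head_idx]
    if all(p == others[0] for p in others):
        return others[0]
    return None
```

With three heads, `pseudo_label_for_head(0, [3, 5, 5])` returns `5` (heads 1 and 2 agree), while `pseudo_label_for_head(1, [3, 5, 5])` returns `None` (heads 0 and 2 disagree).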
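The experiment-setup row quotes "cosine learning rate decay (Loshchilov and Hutter 2016)" but defers the exact hyperparameters to the supplementary material. A minimal sketch of a standard cosine schedule is below; the base learning rate of 0.03 and the decay-to-zero endpoint are assumptions for illustration, not values reported in the paper.

```python
import math

def cosine_lr(step, total_steps, base_lr=0.03):
    # Standard cosine decay: base_lr at step 0, 0 at total_steps.
    # base_lr=0.03 is an illustrative default, not the paper's value.
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * step / total_steps))
```

For example, `cosine_lr(0, 1000)` gives the full base rate `0.03`, and the rate falls smoothly to `0.0` at step 1000.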