Spatial Ensemble: a Novel Model Smoothing Mechanism for Student-Teacher Framework
Authors: Tengteng Huang, Yifan Sun, Xun Wang, Haotian Yao, Chi Zhang
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the effectiveness of the proposed Spatial-Temporal Smoothing by applying it to the state-of-the-art self-supervised approaches (MoCo [1] and BYOL [2]) and semi-supervised method (FixMatch [7]). We use the official implementation of MoCo and re-implement BYOL and FixMatch using PyTorch [30]. All experiments are conducted on a machine with 8 RTX2080 GPUs. |
| Researcher Affiliation | Collaboration | Megvii Technology {huangtengteng, yaohaotian, zhangchi}@megvii.com, sunyf15@tsinghua.org.cn, bnuwangxun@gmail.com |
| Pseudocode | Yes | Algorithm 1 Pseudocode of SE. Algorithm 2 Pseudocode of STS. |
| Open Source Code | Yes | Codes and models are available at: https://github.com/tengteng95/Spatial_Ensemble. |
| Open Datasets | Yes | All the ablation experiments are conducted on the ImageNet dataset [36] and trained for 200 epochs unless noted otherwise. ... We use CIFAR-10 and CIFAR-100 as the benchmark datasets. |
| Dataset Splits | No | The paper uses standard benchmarks (ImageNet, CIFAR-10/100) and states that it follows the training and evaluation settings of the original papers, but it does not explicitly report train/validation/test split percentages or sample counts in its own text. |
| Hardware Specification | Yes | All experiments are conducted on a machine with 8 RTX2080 GPUs. |
| Software Dependencies | No | The paper mentions using 'PyTorch' but does not specify its version number or any other software dependencies with their respective version numbers. |
| Experiment Setup | Yes | The masking probability p is set to 0.7/0.5/0.5 for BYOL/MoCo/FixMatch, respectively. ... The initial learning rate is set to 0.03 and adjusted by cosine learning rate scheduler [32]. Following the original paper, we train the model using SGD with momentum of 0.9, weight decay of 0.0001, and a mini-batch size of 256. |
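
The SE and STS procedures referenced in the Pseudocode row are given as Algorithms 1 and 2 in the paper and in the linked repository. As a rough illustration of the core idea only, the sketch below copies randomly masked student parameters into the teacher; the element-wise granularity of the mask, the interpretation of the masking probability `p` as the fraction of teacher values kept, and the helper name `spatial_ensemble_update` are assumptions, not the paper's exact Algorithm 1.

```python
import torch

@torch.no_grad()
def spatial_ensemble_update(student: torch.nn.Module, teacher: torch.nn.Module, p: float = 0.5) -> None:
    """Hedged sketch of a Spatial Ensemble-style teacher update.

    For each parameter tensor, a random binary mask keeps roughly a fraction p
    of the teacher's current values and overwrites the remaining entries with
    the student's values, so the teacher accumulates interleaved fragments of
    past students. Mask granularity here is element-wise by assumption.
    """
    for s_param, t_param in zip(student.parameters(), teacher.parameters()):
        keep_mask = (torch.rand_like(t_param) < p).float()
        t_param.copy_(keep_mask * t_param + (1.0 - keep_mask) * s_param)
```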
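The hyperparameters quoted in the Experiment Setup row (SGD with momentum 0.9, weight decay 0.0001, initial learning rate 0.03, cosine learning rate schedule, mini-batch size 256, 200 training epochs) could be wired together roughly as follows; the placeholder model, the epoch binding for the scheduler, and the choice of `CosineAnnealingLR` are assumptions rather than the authors' exact training script.

```python
import torch
from torch.optim.lr_scheduler import CosineAnnealingLR

model = torch.nn.Linear(2048, 128)  # placeholder for the actual backbone/projector
epochs = 200                        # ablations are trained for 200 epochs per the paper

optimizer = torch.optim.SGD(
    model.parameters(),
    lr=0.03,            # initial learning rate quoted in the paper
    momentum=0.9,
    weight_decay=1e-4,
)
scheduler = CosineAnnealingLR(optimizer, T_max=epochs)  # cosine learning rate schedule

# The mini-batch size of 256 would be configured in the DataLoader (not shown here).
```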