Hierarchical Self-supervised Augmented Knowledge Distillation
Authors: Chuanguang Yang, Zhulin An, Linhang Cai, Yongjun Xu
IJCAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct evaluations on standard CIFAR-100 and ImageNet [Deng et al., 2009] benchmarks across the widely applied network families including ResNet [He et al., 2016], WRN [Zagoruyko and Komodakis, 2016], VGG [Simonyan and Zisserman, 2015], MobileNet [Sandler et al., 2018] and ShuffleNet [Zhang et al., 2018; Ma et al., 2018]. Some representative KD methods including KD [Hinton et al., 2015], FitNet [Romero et al., 2015], AT [Zagoruyko and Komodakis, 2017], AB [Heo et al., 2019], VID [Ahn et al., 2019], RKD [Park et al., 2019], SP [Tung and Mori, 2019], CC [Peng et al., 2019], CRD [Tian et al., 2020] and SOTA SSKD [Xu et al., 2020] are compared. For a fair comparison, all comparative methods are combined with conventional KD by default, and we adopt rotations {0°, 90°, 180°, 270°} as the self-supervised auxiliary task, the same as SSKD. We use the standard training settings following [Xu et al., 2020] and report the mean result with a standard deviation over 3 runs. |
| Researcher Affiliation | Academia | 1Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China 2University of Chinese Academy of Sciences, Beijing, China |
| Pseudocode | No | The paper describes methods in text and uses figures but does not contain a formal pseudocode or algorithm block. |
| Open Source Code | Yes | Codes are available at https://github.com/winycg/HSAKD. |
| Open Datasets | Yes | We conduct evaluations on standard CIFAR-100 and Image Net [Deng et al., 2009] benchmarks |
| Dataset Splits | No | The paper states "We use the standard training settings following [Xu et al., 2020]" but does not explicitly provide train/validation/test splits (percentages or sample counts), nor does it cite where those splits are defined, so the splits cannot be reproduced from this paper alone. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers, such as specific deep learning frameworks or libraries. |
| Experiment Setup | Yes | We use the standard training settings following [Xu et al., 2020] and report the mean result with a standard deviation over 3 runs. The more detailed settings for reproducibility can be found in our released codes. Following the wide practice, we set the hyper-parameter τ = 1 in task loss and τ = 3 in mimicry loss. Besides, we do not introduce other hyper-parameters. |
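The two technical ingredients quoted above (the {0°, 90°, 180°, 270°} rotation auxiliary task and the temperatures τ = 1 for the task loss and τ = 3 for the mimicry loss) can be illustrated with a minimal sketch. This is not the authors' released implementation (see their repository for that); it is a NumPy illustration of 4-way rotation view generation and a Hinton-style temperature-scaled KD loss, with function names (`rotate_batch`, `kd_mimicry_loss`) chosen here for exposition.

```python
import numpy as np

def rotate_batch(images):
    """Build the 4-way rotation views used as the self-supervised
    auxiliary task: each image rotated by 0, 90, 180, 270 degrees.
    images: (N, H, W, C) array -> ((4N, H, W, C) views, (4N,) rotation labels)."""
    views = [np.rot90(images, k=k, axes=(1, 2)) for k in range(4)]
    labels = np.repeat(np.arange(4), len(images))
    return np.concatenate(views, axis=0), labels

def _softmax(logits, tau):
    z = logits / tau
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kd_mimicry_loss(student_logits, teacher_logits, tau=3.0):
    """Hinton-style mimicry loss: KL divergence between temperature-softened
    teacher and student distributions, scaled by tau**2 so gradient magnitudes
    stay comparable across temperatures (the paper uses tau = 3 here)."""
    p_t = _softmax(teacher_logits, tau)
    log_ratio = np.log(p_t + 1e-12) - np.log(_softmax(student_logits, tau) + 1e-12)
    return tau ** 2 * np.mean(np.sum(p_t * log_ratio, axis=-1))
```

The task loss would use the same softmax with τ = 1 against ground-truth labels; no further hyper-parameters are introduced, matching the paper's claim.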