Robust Knowledge Transfer via Hybrid Forward on the Teacher-Student Model

Authors: Liangchen Song, Jialian Wu, Ming Yang, Qian Zhang, Yuan Li, Junsong Yuan

AAAI 2021, pp. 2558-2566 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate the effectiveness of our method on a variety of tasks, e.g., model compression, segmentation, and detection, under a variety of knowledge transfer settings. ... To validate the effectiveness of our proposed method, we conduct experiments on four different settings, as introduced in the introduction of the paper (Fig. 1).
Researcher Affiliation | Collaboration | ¹University at Buffalo, ²Horizon Robotics, ³Google; {lsong8,jialianw,jsyuan}@buffalo.edu, m-yang4@u.northwestern.edu, qian01.zhang@horizon.ai, liyu@google.com
Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | We begin our experiments with the gap between the teacher and the student: the model gap. Such a setting is also known as distillation-based model compression. In the experiments, a teacher network that is well-trained on ImageNet (Deng et al. 2009) will be used to guide a shallower student network. ... The GTA5 dataset has 24966 images and we randomly select 500 images out as the validation set for training the teacher network. Apart from the above, to better investigate which gap is more challenging, we employ a multi-task setting on the Cityscapes dataset. ... The dataset for the student network is CityPersons (Zhang, Benenson, and Schiele 2017), which uses the images from Cityscapes with the pedestrians manually relabeled.
Dataset Splits | Yes | The GTA5 dataset has 24966 images and we randomly select 500 images out as the validation set for training the teacher network. ... Since we are interested in whether the knowledge from the teacher can help the student, we first present results on the Cityscapes validation set with the above two teachers. ... In Tab. 3, we show the results on the GTA5 validation set, which was split off previously to acquire a teacher network on GTA5. ... The results on the CityPersons validation set are presented in Tab. 4.
Hardware Specification | No | The paper does not specify any particular hardware (e.g., GPU models, CPU types, or memory) used for running the experiments. It only mentions general concepts like 'mobile phone' or 'server' in a theoretical context.
Software Dependencies | No | The paper mentions 'Torchvision' and, implicitly, PyTorch, but it does not provide specific version numbers for these or any other software dependencies.
Experiment Setup | Yes | For training hyper-parameters, we use the same parameters as (Heo et al. 2019a): Batch size is set to 256; Learning rate is initialized with 0.1 and decayed by 0.1 every 30 epochs. ... we use the same training hyper-parameters: batch size of 8, 40000 iterations and learning rate starting from 1e-3 with polynomial decay.
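The two learning-rate schedules quoted above can be written down explicitly. This is a minimal sketch, not the authors' code: the step schedule matches the stated "0.1, decayed by 0.1 every 30 epochs", and the polynomial schedule matches "1e-3 with polynomial decay over 40000 iterations"; the decay exponent `power=0.9` is an assumption (a common default for segmentation training) since the paper does not state it.

```python
def step_lr(base_lr, epoch, step_size=30, gamma=0.1):
    """Step schedule: multiply the base LR by `gamma` every `step_size` epochs.

    Matches the quoted compression setting: base_lr=0.1, decay 0.1 / 30 epochs.
    """
    return base_lr * gamma ** (epoch // step_size)


def poly_lr(base_lr, iteration, max_iters=40000, power=0.9):
    """Polynomial schedule: LR decays from `base_lr` to 0 over `max_iters`.

    Matches the quoted segmentation setting (base_lr=1e-3, 40000 iterations);
    `power=0.9` is an assumed value, not stated in the paper.
    """
    return base_lr * (1.0 - iteration / max_iters) ** power


# Example: LR at a few points in each schedule.
lrs_step = [step_lr(0.1, e) for e in (0, 30, 60)]   # [0.1, 0.01, 0.001]
lrs_poly = [poly_lr(1e-3, i) for i in (0, 40000)]   # [0.001, 0.0]
```

In PyTorch these correspond to `torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)` and a `LambdaLR` wrapping the polynomial factor.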