Knowledge Amalgamation from Heterogeneous Networks by Common Feature Learning

Authors: Sihui Luo, Xinchao Wang, Gongfan Fang, Yao Hu, Dapeng Tao, Mingli Song

IJCAI 2019

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "We test the proposed approach on a list of benchmarks and demonstrate that the learned student is able to achieve very promising performance, superior to those of the teachers in their specialized tasks." |
| Researcher Affiliation | Collaboration | Sihui Luo (Zhejiang University), Xinchao Wang (Stevens Institute of Technology), Gongfan Fang (Zhejiang University), Yao Hu (Alibaba Group), Dapeng Tao (Yunnan University), Mingli Song (Zhejiang University) |
| Pseudocode | No | The paper contains no pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | The paper provides no statement or link indicating the release of open-source code for the described methodology. |
| Open Datasets | Yes | "We test the proposed method on a list of classification datasets summarized in Tab. 1." Given a dataset, the teacher networks are pre-trained against one-hot image-level labels using the cross-entropy loss; for face recognition, CASIA WebFace [Yi et al., 2014] or MS-Celeb-1M serves as the training data. |
| Dataset Splits | Yes | "During training, we explore face verification datasets including LFW [Huang et al., 2008], CFP-FP [Sengupta et al., 2016], and AgeDB-30 [Moschoglou et al., 2017] as the validation set." |
| Hardware Specification | Yes | "We implement our method using PyTorch [He et al., 2016] on a Quadro P5000 16G GPU." |
| Software Dependencies | No | The paper mentions PyTorch but does not specify a version number, which is required for reproducible software dependencies. |
| Experiment Setup | Yes | An Adam [Kingma and Ba, 2014] optimizer is used to train the student network, with a learning rate of 1e-4; the batch size is 128 on classification datasets and 64 on face recognition ones. |
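The reported experiment setup (Adam, learning rate 1e-4, batch size 128 for classification / 64 for face recognition) can be sketched in PyTorch. This is a minimal illustration only: the `student` network below is a hypothetical stand-in, since the paper's actual student architecture is not described in this report.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the student network (not the paper's architecture).
student = nn.Linear(512, 10)

# Optimizer settings as reported: Adam with learning rate 1e-4.
optimizer = torch.optim.Adam(student.parameters(), lr=1e-4)

# Batch sizes as reported: 128 for classification, 64 for face recognition.
batch_size_classification = 128
batch_size_face_recognition = 64
```

The snippet only fixes the hyperparameters the report extracts; the amalgamation losses and teacher networks from the paper are out of scope here.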