Hierarchical Knowledge Squeezed Adversarial Network Compression

Authors: Peng Li, Changyong Shu, Yuan Xie, Yanyun Qu, Hui Kong

AAAI 2020, pp. 11370-11377 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experimental results on three typical benchmark datasets, i.e., CIFAR-10, CIFAR-100, and ImageNet, demonstrate that our method achieves highly superior performances against state-of-the-art methods.
Researcher Affiliation | Collaboration | Peng Li (1,2), Changyong Shu (1,2), Yuan Xie (1), Yanyun Qu (3), Hui Kong (2). 1: School of Computer Science and Technology, East China Normal University, Shanghai, China; 2: Nanjing Institute of Advanced Artificial Intelligence, Horizon Robotics, Nanjing, China; 3: Fujian Key Laboratory of Sensing and Computing for Smart City, School of Information Science and Engineering, Xiamen University, Fujian, China
Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any links to open-source code or explicit statements about its availability.
Open Datasets | Yes | We consider three image classification datasets: CIFAR-10, CIFAR-100, and ImageNet ILSVRC 2012. Both CIFAR-10 and CIFAR-100 contain 50K training images and 10K validation images, respectively. The ImageNet ILSVRC 2012 contains more than 1 million training images from 1000 object categories and 20K validation images with each category including 20 images.
Dataset Splits | Yes | Both CIFAR-10 and CIFAR-100 contain 50K training images and 10K validation images, respectively. The ImageNet ILSVRC 2012 contains more than 1 million training images from 1000 object categories and 20K validation images with each category including 20 images. For all experiments, we train on the standard training set and test on the validation set. (A data-loading sketch for the standard CIFAR splits follows the table.)
Hardware Specification | Yes | Our implementation is based on PyTorch, with 1 and 4 NVIDIA GTX 1080 Ti GPUs for CIFAR-10/100 and ImageNet, respectively.
Software Dependencies | No | The paper states "Our implementation is based on PyTorch", but it does not specify the version of PyTorch or any other software dependencies. (An environment-check sketch follows the table.)
Experiment Setup | Yes | For CIFAR-10 and CIFAR-100, we set the pre-trained teacher as ResNet-164, the student as ResNet-20... We select the minibatch size as 64 and the total training epochs as 600, with the learning rate multiplied by 0.1 at epoch 240 and epoch 480. We use Stochastic Gradient Descent (SGD) with momentum as the optimizer, and set the momentum as 0.9 and weight decay as 1e-4. The learning rate, initialized as 1e-1 and 1e-3 for the student and discriminator, respectively... (An optimizer-schedule sketch follows the table.)
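The standard splits quoted above can be loaded directly through torchvision. The sketch below is illustrative only, assuming common CIFAR defaults: the crop/flip augmentation and the normalization statistics are conventional choices, not values reported by the authors; only the 50K/10K split and the minibatch size of 64 come from the paper.

```python
# Illustrative sketch, not the authors' code: standard CIFAR-10 train/test
# splits via torchvision. Augmentation and normalization values are common
# defaults (assumed); the 50K/10K split and batch size 64 match the paper.
import torchvision
import torchvision.transforms as T
from torch.utils.data import DataLoader

normalize = T.Normalize((0.4914, 0.4822, 0.4465), (0.2470, 0.2435, 0.2616))
train_tf = T.Compose([
    T.RandomCrop(32, padding=4),   # conventional CIFAR augmentation (assumed)
    T.RandomHorizontalFlip(),
    T.ToTensor(),
    normalize,
])
test_tf = T.Compose([T.ToTensor(), normalize])

# train=True/False selects the standard 50K training / 10K validation split.
train_set = torchvision.datasets.CIFAR10("./data", train=True, download=True, transform=train_tf)
val_set = torchvision.datasets.CIFAR10("./data", train=False, download=True, transform=test_tf)

train_loader = DataLoader(train_set, batch_size=64, shuffle=True, num_workers=4)
val_loader = DataLoader(val_set, batch_size=64, shuffle=False, num_workers=4)
```

Swapping in torchvision.datasets.CIFAR100 covers the second dataset; ImageNet ILSVRC 2012 requires a manual download and torchvision.datasets.ImageNet.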
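Because the PyTorch version is unspecified, anyone re-running the experiments may want to record the exact environment so that result differences can be attributed. A minimal check, using only standard torch introspection calls:

```python
# Minimal environment record (not from the paper): log the framework
# version and visible GPUs before training, since the paper pins neither.
import torch

print("PyTorch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
for i in range(torch.cuda.device_count()):
    print(f"GPU {i}:", torch.cuda.get_device_name(i))
```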
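The quoted hyperparameters map onto a standard PyTorch optimizer-plus-scheduler setup. The sketch below is a reconstruction under stated assumptions, not the authors' released code: `student` and `discriminator` are placeholder modules standing in for the paper's ResNet-20 student and its discriminator, and it assumes the momentum/weight-decay settings and the epoch-240/480 decay apply to both networks, which the quoted text does not state explicitly for the discriminator.

```python
# Hedged reconstruction of the reported schedule, not the authors' code.
# Assumptions: placeholder modules stand in for ResNet-20 and the
# discriminator; momentum 0.9, weight decay 1e-4, and the 0.1 decay at
# epochs 240/480 are assumed to apply to both optimizers.
import torch.nn as nn
from torch.optim import SGD
from torch.optim.lr_scheduler import MultiStepLR

student = nn.Linear(10, 10)       # placeholder for the ResNet-20 student
discriminator = nn.Linear(10, 2)  # placeholder for the method's discriminator

opt_s = SGD(student.parameters(), lr=1e-1, momentum=0.9, weight_decay=1e-4)
opt_d = SGD(discriminator.parameters(), lr=1e-3, momentum=0.9, weight_decay=1e-4)

sched_s = MultiStepLR(opt_s, milestones=[240, 480], gamma=0.1)
sched_d = MultiStepLR(opt_d, milestones=[240, 480], gamma=0.1)

for epoch in range(600):
    # ... one epoch of adversarial distillation (method-specific, omitted) ...
    sched_s.step()
    sched_d.step()
```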