Knowledge Consistency between Neural Networks and Beyond

Authors: Ruofan Liang, Tianlin Li, Longfei Li, Jing Wang, Quanshi Zhang

ICLR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In preliminary experiments, we have used knowledge consistency as a tool to diagnose representations of neural networks. Knowledge consistency provides new insights to explain the success of existing deep-learning techniques, such as knowledge distillation and network compression. More crucially, knowledge consistency can also be used to refine pre-trained networks and boost performance.
Researcher Affiliation | Collaboration | Ruofan Liang (a), Tianlin Li (a), Longfei Li (a), Jing Wang (b), and Quanshi Zhang (a); (a) Shanghai Jiao Tong University, (b) Huawei Technologies
Pseudocode | No | The paper includes mathematical equations (e.g., Equations 1, 3, and 4) and a network diagram (Figure 2), but it does not contain any formally structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not contain any explicit statement about releasing source code, nor does it provide a link to a code repository for the described methodology.
Open Datasets | Yes | A total of five typical DNNs for image classification were used, i.e. the AlexNet (Krizhevsky et al., 2012), the VGG-16 (Simonyan & Zisserman, 2015), and the ResNet-18, ResNet-34, ResNet-50 (He et al., 2016). These DNNs were learned using three benchmark datasets, which included the CUB200-2011 dataset (Wah et al., 2011), the Stanford Dogs dataset (Khosla et al., 2011), and the Pascal VOC 2012 dataset (Everingham et al., 2015). (A model-construction sketch for these architectures follows the table.)
Dataset Splits | No | The paper mentions training DNNs and evaluating performance with 'top-1 accuracy', which implies data splits, and Section 4.2 states that 'We randomly divided all training samples in the CUB200-2011 dataset (Wah et al., 2011) into two subsets, each containing 50% samples'. However, it does not explicitly provide percentages or absolute counts for comprehensive train/validation/test splits for the main experiments, nor does it state that the standard predefined splits of the benchmark datasets were used. (A sketch of the 50/50 split described in Section 4.2 follows the table.)
Hardware Specification | No | The paper trains and evaluates deep neural networks, but it does not specify the hardware used for the experiments, such as GPU models, CPU types, or memory.
Software Dependencies | No | The paper mentions model families such as MLPs and CNNs and refers to the deep learning frameworks PyTorch and Caffe, but it does not provide version numbers for these or any other software dependencies.
Experiment Setup | Yes | We set λ = 0.1 for all experiments, except for feature reconstruction of AlexNet (we set λ = 8.0 for AlexNet features). ... All DNNs were learned without data augmentation or pre-training. ... We trained DNNs for fine-grained classification using the CUB200-2011 dataset (Wah et al., 2011) (without data augmentation). ... we applied the distillation loss in (Hinton et al., 2014) following the parameter setting τ = 1.0 in (Mishra & Marr, 2018), i.e. Loss = Loss_classify + 0.5 · Loss_distill. (A sketch of this combined loss follows the table.)
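
The open-datasets row lists five architectures (AlexNet, VGG-16, ResNet-18/34/50) trained from scratch. A minimal sketch of how they could be instantiated, assuming torchvision implementations; the paper does not say which implementations or which class counts were used:

```python
from torchvision import models

# Assumed torchvision constructors for the five architectures named in the paper.
ARCHITECTURES = {
    "alexnet": models.alexnet,
    "vgg16": models.vgg16,
    "resnet18": models.resnet18,
    "resnet34": models.resnet34,
    "resnet50": models.resnet50,
}

def build_model(name: str, num_classes: int):
    """Build one architecture from scratch (no pre-trained weights),
    matching the paper's statement that DNNs were learned without pre-training."""
    return ARCHITECTURES[name](num_classes=num_classes)

# Example: a ResNet-34 with 200 output classes for CUB200-2011.
model = build_model("resnet34", num_classes=200)
```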
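The dataset-splits row quotes Section 4.2, which divides the CUB200-2011 training samples into two random subsets of 50% each. A minimal sketch of such a split, assuming the training samples are available as a list of identifiers; the seed and identifier format are illustrative assumptions:

```python
import random

def split_in_half(sample_ids, seed=0):
    """Randomly divide training samples into two equally sized subsets,
    as described for CUB200-2011 in Section 4.2 of the paper."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)
    half = len(ids) // 2
    return ids[:half], ids[half:]

# Example with integer sample ids; the standard CUB200-2011 training split has 5,994 images.
subset_a, subset_b = split_in_half(range(5994))
```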
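The experiment-setup row quotes the combined objective Loss = Loss_classify + 0.5 · Loss_distill with τ = 1.0. A hedged PyTorch sketch of this objective, using the KL-divergence form of the distillation term from Hinton et al. (2014); the batchmean reduction and the τ² scaling are implementation assumptions, not details taken from the paper:

```python
import torch.nn.functional as F

def distillation_objective(student_logits, teacher_logits, labels,
                           tau=1.0, distill_weight=0.5):
    """Loss = Loss_classify + 0.5 * Loss_distill with temperature tau = 1.0,
    as stated in the experiment setup."""
    # Standard classification term on the student's predictions.
    loss_classify = F.cross_entropy(student_logits, labels)
    # Distillation term: KL divergence between temperature-softened
    # student and teacher distributions (Hinton et al., 2014).
    loss_distill = F.kl_div(
        F.log_softmax(student_logits / tau, dim=1),
        F.softmax(teacher_logits / tau, dim=1),
        reduction="batchmean",
    ) * (tau ** 2)
    return loss_classify + distill_weight * loss_distill
```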