IMF: Integrating Matched Features Using Attentive Logit in Knowledge Distillation
Authors: Jeongho Kim, Hanbeen Lee, Simon S. Woo
IJCAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through extensive experiments, we demonstrate that IMF consistently outperforms other state-of-the-art methods with a large margin over the various datasets in different tasks without extra computation. |
| Researcher Affiliation | Collaboration | Jeongho Kim¹, Hanbeen Lee², Simon S. Woo³; ¹Korea Advanced Institute of Science and Technology (KAIST), S. Korea; ²NAVER Z Corporation, S. Korea; ³Department of Artificial Intelligence, Sungkyunkwan University, S. Korea |
| Pseudocode | No | The paper includes mathematical equations and descriptive text for the method, but no explicit pseudocode or algorithm block. |
| Open Source Code | No | The paper does not contain an explicit statement about releasing source code or a link to a code repository. |
| Open Datasets | Yes | For example, in CIFAR100 [Krizhevsky and Hinton, 2009], which has 100 classes, 100 parameters are added to the output logits of each IFD layer. |
| Dataset Splits | Yes | Training details. Backbone architecture and training settings for experiments are similar to the recent research [Tian et al., 2019]. |
| Hardware Specification | No | The paper mentions parameters and FLOPs for model size comparison but does not specify the hardware (e.g., GPU or CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper refers to architecture components like 'Depth Conv(3×3) Point Conv(1×1) BN & ReLU' but does not list any specific software dependencies with version numbers (e.g., PyTorch 1.9, TensorFlow 2.x). |
| Experiment Setup | Yes | Training details. Backbone architecture and training settings for experiments are similar to the recent research [Tian et al., 2019]. In our method, we conduct a grid search to choose the α and β values in Eq. 5 from {10, 20, 30, 40}. The IFD block has the same structure in all experiments and model architectures. Specifically, we used a block structure of Depth Conv(3×3) Point Conv(1×1) BN & ReLU. |
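
The grid search described in the Experiment Setup row can be sketched as follows. This is a minimal illustration, not the authors' code: the candidate set {10, 20, 30, 40} for α and β comes from the quoted text, while `grid_search`, `evaluate`, and `toy_evaluate` are hypothetical names, and the selection metric (assumed here to be a validation score where higher is better) is not stated in the source.

```python
from itertools import product

# Candidate values for alpha and beta, as quoted in the Experiment Setup row.
CANDIDATES = (10, 20, 30, 40)


def grid_search(evaluate, candidates=CANDIDATES):
    """Return (best_alpha, best_beta, best_score) over all candidate pairs.

    `evaluate` is a hypothetical callable (alpha, beta) -> score, where a
    higher score is better. The source does not specify the metric used.
    """
    best = None
    for alpha, beta in product(candidates, repeat=2):
        score = evaluate(alpha, beta)
        if best is None or score > best[2]:
            best = (alpha, beta, score)
    return best


def toy_evaluate(alpha, beta):
    # Placeholder objective for illustration only; not a result from the paper.
    # In practice this would train the student model and return validation accuracy.
    return -((alpha - 30) ** 2 + (beta - 20) ** 2)


best_alpha, best_beta, best_score = grid_search(toy_evaluate)
print(best_alpha, best_beta, best_score)
```

With four candidates per hyperparameter, the search covers 16 (α, β) pairs, so each pair would correspond to one full training run under this assumed setup.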