Weighted Mutual Learning with Diversity-Driven Model Compression
Authors: Miao Zhang, Li Wang, David Campos, Wei Huang, Chenjuan Guo, Bin Yang
NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments show the generalization of the proposed framework, which outperforms existing online distillation methods on a variety of deep neural networks. |
| Researcher Affiliation | Academia | ¹Harbin Institute of Technology (Shenzhen), ²Aalborg University, ³UNSW, ⁴East China Normal University, ⁵Shandong First Medical University |
| Pseudocode | Yes | Algorithm 1 Weighted Mutual Learning (WML) for Online Distillation |
| Open Source Code | No | The paper does not provide an explicit statement about code release or a direct link to its source code in the main body. While the checklist at the end indicates "code... Yes", this is not part of the scientific narrative and gives no concrete access point. |
| Open Datasets | Yes | We perform a series of experiments to evaluate the effectiveness of our framework on seven convolutional neural networks with three image classification datasets... CIFAR10, CIFAR100, and ImageNet. |
| Dataset Splits | No | The paper mentions using standard datasets like CIFAR10, CIFAR100, and ImageNet, but it does not explicitly provide the training, validation, or test split percentages or sample counts needed to reproduce the data partitioning. It discusses models and pruning ratios but not the dataset splits. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to run its experiments, such as CPU models, GPU models (e.g., NVIDIA A100, Tesla V100), or memory specifications. While a checklist at the end indicates "type of resources used... Yes", this information is not present in the paper's main content. |
| Software Dependencies | No | The paper mentions various neural network architectures (e.g., ResNet, MobileNetV2) and concepts (e.g., SGD, KL divergence), but it does not specify any software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions, or specific libraries with their releases). |
| Experiment Setup | Yes | We perform a series of experiments to evaluate the effectiveness of our framework on seven convolutional neural networks with three image classification datasets. With the peer importance ω_k, run T steps of SGD to update the model parameters θ with the weighted loss function in Eq.(1);... where η is the step size... We make the peer models toward {30%, 40%, 50%} pruning ratios on FLOPs for ResNet32 and ResNet56 on CIFAR10, and {30%, 50%, 70%} for MobileNetV2 on ImageNet. |
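
The training loop quoted in the "Experiment Setup" row (run T steps of SGD on the peer models with the importance-weighted loss of Eq.(1)) can be sketched roughly as below. This is a minimal PyTorch illustration under our own assumptions, not the authors' code: the exact form of Eq.(1) and the update rule for the peer weights ω_k are not reproduced, and the names `peer_loss` and `run_sgd_steps` are hypothetical.

```python
# Hedged sketch of an importance-weighted mutual-learning step: each peer's
# loss is scaled by omega[k]. The exact Eq.(1) and the omega update are not
# reproduced here; function names are chosen only for this illustration.
import torch
import torch.nn.functional as F


def peer_loss(logits_k, labels, other_logits, temperature=1.0):
    """Cross-entropy plus the mean KL divergence toward the other peers
    (a common mutual-learning form; the paper's Eq.(1) may differ)."""
    ce = F.cross_entropy(logits_k, labels)
    kd = 0.0
    for p in other_logits:
        kd = kd + F.kl_div(
            F.log_softmax(logits_k / temperature, dim=1),
            F.softmax(p.detach() / temperature, dim=1),  # peers act as soft teachers
            reduction="batchmean",
        )
    return ce + kd / max(len(other_logits), 1)


def run_sgd_steps(peers, omega, batches, step_size=0.1):
    """One SGD step per batch on all peers; peer k's loss is weighted by
    omega[k], and step_size plays the role of eta in the quoted setup."""
    params = [p for model in peers for p in model.parameters()]
    opt = torch.optim.SGD(params, lr=step_size)
    for x, y in batches:
        logits = [model(x) for model in peers]
        total = sum(
            omega[k] * peer_loss(
                logits[k], y,
                [logits[j] for j in range(len(peers)) if j != k],
            )
            for k in range(len(peers))
        )
        opt.zero_grad()
        total.backward()
        opt.step()
```

In the paper's setting the peers would be pruned variants of a shared backbone (e.g., ResNet32 at different FLOPs ratios) and the importance weights ω_k would themselves be re-estimated after the T inner steps; both of those pieces are omitted from this sketch.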