Multi-Task Zipping via Layer-wise Neuron Sharing

Authors: Xiaoxi He, Zimu Zhou, Lothar Thiele

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental (4 experiments) | We evaluate the performance of MTZ on zipping networks pre-trained for the same task (Sec. 4.1) and different tasks (Sec. 4.2 and Sec. 4.3). We mainly assess the test errors of each task after network zipping and the retraining overhead involved. MTZ is implemented with TensorFlow. All experiments are conducted on a workstation equipped with an Nvidia Titan X (Maxwell) GPU.
Researcher Affiliation | Academia | Xiaoxi He (ETH Zurich, hex@ethz.ch), Zimu Zhou (ETH Zurich, zzhou@tik.ee.ethz.ch), Lothar Thiele (ETH Zurich, thiele@ethz.ch)
Pseudocode | Yes | Algorithm 1: Multi-Task Zipping via Layer-wise Neuron Sharing (a simplified sketch of the layer-wise sharing step is given after this table)
Open Source Code | No | The paper states 'MTZ is implemented with TensorFlow.' but does not provide a link to, or an explicit statement about, the availability of source code for the proposed method.
Open Datasets | Yes | We experiment on the MNIST dataset with the LeNet-300-100 and LeNet-5 networks [14] to recognize handwritten digits from zero to nine. We also explore merging two VGG-16 networks trained on the ImageNet ILSVRC2012 dataset [24] for object classification and the CelebA dataset [16] for facial attribute classification.
Dataset Splits | No | The paper discusses 'test errors' and 'retraining', but does not explicitly provide train/validation/test dataset splits (e.g., percentages, sample counts, or a detailed splitting methodology).
Hardware Specification | Yes | All experiments are conducted on a workstation equipped with an Nvidia Titan X (Maxwell) GPU.
Software Dependencies | No | The paper states 'MTZ is implemented with TensorFlow.' but does not provide a version number for TensorFlow or any other software dependencies.
Experiment Setup | Yes | All the networks are initialized randomly with different seeds, and the training data are also shuffled before every training epoch. After training, the ordering of neurons/kernels in all hidden layers is once more randomly permuted (see the permutation sketch after this table). The training of the LeNet-300-100 and LeNet-5 networks requires 1.05 × 10^4 and 1.1 × 10^4 iterations on average, respectively.
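
To make the "Pseudocode" row more concrete, the following is a minimal NumPy sketch of what layer-wise neuron sharing involves: two equally sized fully connected layers from pre-trained networks A and B are zipped by greedily pairing neurons and merging each pair into one shared neuron. This is only an illustrative surrogate, not the paper's Algorithm 1: MTZ selects pairs via a Hessian-based functional difference and applies light retraining after each layer, whereas the sketch uses plain Euclidean distance on incoming weights and a simple average; all names and shapes are assumed for illustration.

```python
import numpy as np


def share_layer(W_a, W_b, num_shared):
    """Greedily match `num_shared` neuron pairs (columns of W_a and W_b)
    and replace each matched pair with a single shared neuron."""
    n_a, n_b = W_a.shape[1], W_b.shape[1]
    # Pairwise Euclidean distance between incoming-weight vectors; a stand-in
    # for the paper's Hessian-based functional difference.
    dist = np.linalg.norm(W_a[:, :, None] - W_b[:, None, :], axis=0)  # (n_a, n_b)

    pairs, used_a, used_b = [], set(), set()
    # Visit candidate pairs from smallest to largest distance, greedily.
    flat_order = np.argsort(dist, axis=None)
    for i, j in zip(*np.unravel_index(flat_order, dist.shape)):
        if len(pairs) == num_shared:
            break
        if i in used_a or j in used_b:
            continue
        pairs.append((i, j))
        used_a.add(i)
        used_b.add(j)

    # Merge each matched pair by averaging (the paper uses a Hessian-weighted
    # combination instead) and keep the unmatched, task-specific neurons.
    shared = np.stack([(W_a[:, i] + W_b[:, j]) / 2.0 for i, j in pairs], axis=1)
    only_a = W_a[:, [i for i in range(n_a) if i not in used_a]]
    only_b = W_b[:, [j for j in range(n_b) if j not in used_b]]
    return shared, only_a, only_b


# Toy usage with LeNet-300-100-like layer sizes (shapes assumed for illustration).
rng = np.random.default_rng(0)
W_a = rng.normal(size=(784, 300))
W_b = rng.normal(size=(784, 300))
shared, only_a, only_b = share_layer(W_a, W_b, num_shared=200)
print(shared.shape, only_a.shape, only_b.shape)  # (784, 200) (784, 100) (784, 100)
```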
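The "Experiment Setup" row also mentions that neuron/kernel orderings are randomly permuted after training, so that any alignment MTZ finds cannot come from a lucky index ordering. The snippet below is a minimal sketch, with assumed LeNet-300-100-like shapes and a toy ReLU forward pass, of how a hidden layer's units can be permuted without changing the network's function: the layer's outgoing weights and biases and the next layer's incoming weights are reordered consistently.

```python
import numpy as np


def permute_hidden_layer(W_in, b_in, W_out, rng):
    """Permute the units of one hidden layer consistently.
    W_in: (fan_in, n_hidden), b_in: (n_hidden,), W_out: (n_hidden, fan_out)."""
    perm = rng.permutation(W_in.shape[1])
    return W_in[:, perm], b_in[perm], W_out[perm, :]


rng = np.random.default_rng(0)
# Toy LeNet-300-100-style weights (assumed for illustration).
W1, b1 = rng.normal(size=(784, 300)), rng.normal(size=300)
W2, b2 = rng.normal(size=(300, 100)), rng.normal(size=100)
W3 = rng.normal(size=(100, 10))

x = rng.normal(size=(1, 784))
relu = lambda z: np.maximum(z, 0.0)
before = relu(relu(x @ W1 + b1) @ W2 + b2) @ W3

# Permute both hidden layers; the function computed by the network is unchanged.
W1, b1, W2 = permute_hidden_layer(W1, b1, W2, rng)
W2, b2, W3 = permute_hidden_layer(W2, b2, W3, rng)
after = relu(relu(x @ W1 + b1) @ W2 + b2) @ W3

print(np.allclose(before, after))  # True: outputs are identical
```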