Layer Compression of Deep Networks with Straight Flows

Authors: Chengyue Gong, Xiaocong Du, Bhargav Bhushanam, Lemeng Wu, Xingchao Liu, Dhruv Choudhary, Arun Kejariwal, Qiang Liu

AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirically, we demonstrate that our method outperforms direct distillation and other baselines on different model architectures (e.g., ResNet, ViT) on image classification and semantic segmentation tasks.
Researcher Affiliation | Collaboration | 1) University of Texas at Austin, 2) Meta, Inc.
Pseudocode | Yes | Algorithm 1: Compression with Straight Flows: Main Algorithm
Open Source Code | No | The paper does not provide an explicit statement about releasing their code or a link to a code repository for their proposed method.
Open Datasets | Yes | We evaluate our model performance on CIFAR-10 and ImageNet, upon vision transformers and ResNet.
Dataset Splits | Yes | We evaluate our model performance on CIFAR-10 and ImageNet, upon vision transformers and ResNet.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or cloud instance specifications used for running the experiments.
Software Dependencies | No | The paper mentions software such as 'AdamW' and 'mmsegmentation' but does not provide specific version numbers for these or any other software dependencies.
Experiment Setup | Yes | We use the AdamW (Loshchilov and Hutter 2018) optimizer with batch size 512 and an initial learning rate of 5×10⁻⁴ with cosine learning rate decay (Loshchilov and Hutter 2019). For our method, the first two stages are trained for 300 epochs each. For the final distillation refinement stage, we train the model for 400 epochs. (See the configuration sketch below.)
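
For concreteness, the reported optimizer and schedule settings can be written out roughly as follows. This is a minimal sketch assuming PyTorch; the tiny placeholder model, the empty epoch loop, and the variable names are illustrative assumptions, not the authors' implementation, and only the batch size (512), AdamW, the 5e-4 initial learning rate, cosine decay, and the 300-epoch stage length come from the quoted setup.

```python
# Minimal sketch of the reported training configuration, assuming PyTorch.
# The linear model and the empty epoch loop are placeholders, not the paper's code.
import torch
from torch.optim import AdamW
from torch.optim.lr_scheduler import CosineAnnealingLR

EPOCHS = 300       # reported length of each of the first two training stages
BATCH_SIZE = 512   # reported batch size

model = torch.nn.Linear(8, 8)                           # placeholder network
optimizer = AdamW(model.parameters(), lr=5e-4)          # initial learning rate 5e-4
scheduler = CosineAnnealingLR(optimizer, T_max=EPOCHS)  # cosine learning-rate decay

for epoch in range(EPOCHS):
    # ... iterate over the training set in batches of BATCH_SIZE, compute the loss,
    # call loss.backward() and optimizer.step() ...
    scheduler.step()  # decay the learning rate once per epoch
```

The final distillation refinement stage would follow the same pattern with EPOCHS set to 400, per the quoted setup.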