Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Layer Compression of Deep Networks with Straight Flows

Authors: Chengyue Gong, Xiaocong Du, Bhargav Bhushanam, Lemeng Wu, Xingchao Liu, Dhruv Choudhary, Arun Kejariwal, Qiang Liu

AAAI 2024

Reproducibility Variable Result LLM Response
Research Type Experimental Empirically, we demonstrate that our method outperforms direct distillation and other baselines on different model architectures (e.g. ResNet, ViT) on image classification and semantic segmentation tasks.
Researcher Affiliation Collaboration 1 University of Texas at Austin, 2 Meta, Inc.
Pseudocode Yes Algorithm 1: Compression with Straight Flows: Main Algorithm
Open Source Code No The paper does not provide an explicit statement about releasing their code or a link to a code repository for their proposed method.
Open Datasets Yes We evaluate our model performance on CIFAR-10 and ImageNet, upon vision transformers and ResNet.
Dataset Splits Yes We evaluate our model performance on CIFAR-10 and ImageNet, upon vision transformers and ResNet.
Hardware Specification No The paper does not provide specific hardware details such as GPU models, CPU types, or cloud instance specifications used for running the experiments.
Software Dependencies No The paper mentions software like 'AdamW' and 'mmsegmentation' but does not provide specific version numbers for these or any other software dependencies.
Experiment Setup Yes We use the AdamW (Loshchilov and Hutter 2018) optimizer with batch size 512 and an initial learning rate of 5×10⁻⁴ with cosine learning rate decay (Loshchilov and Hutter 2019). For our method, the first two stages are each trained for 300 epochs. For the final distillation refinement stage, we train the model for 400 epochs.
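The quoted setup pairs an initial learning rate of 5×10⁻⁴ with cosine decay. A minimal sketch of that schedule, assuming the standard cosine-annealing formula (the paper does not specify warmup or a minimum learning rate, so `min_lr=0.0` is an assumption):

```python
import math

def cosine_lr(step: int, total_steps: int,
              base_lr: float = 5e-4, min_lr: float = 0.0) -> float:
    """Cosine learning-rate decay from base_lr down to min_lr.

    base_lr=5e-4 matches the initial learning rate quoted above;
    min_lr=0.0 is an assumption, not stated in the paper.
    """
    progress = step / total_steps
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))

# Example over one 300-epoch training stage:
epochs = 300
print(cosine_lr(0, epochs))    # start of training: 5e-4
print(cosine_lr(150, epochs))  # midpoint: half the initial rate
print(cosine_lr(300, epochs))  # end of stage: decayed to ~0
```

With this formula the rate falls smoothly from 5×10⁻⁴ at epoch 0 to roughly zero at the end of each stage; each training stage (300 or 400 epochs) would restart the schedule with its own `total_steps`.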