Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Layer Compression of Deep Networks with Straight Flows
Authors: Chengyue Gong, Xiaocong Du, Bhargav Bhushanam, Lemeng Wu, Xingchao Liu, Dhruv Choudhary, Arun Kejariwal, Qiang Liu
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, we demonstrate that our method outperforms direct distillation and other baselines on different model architectures (e.g., ResNet, ViT) on image classification and semantic segmentation tasks. |
| Researcher Affiliation | Collaboration | 1 University of Texas at Austin, 2 Meta, Inc. |
| Pseudocode | Yes | Algorithm 1: Compression with Straight Flows: Main Algorithm |
| Open Source Code | No | The paper does not provide an explicit statement about releasing their code or a link to a code repository for their proposed method. |
| Open Datasets | Yes | We evaluate our model performance on CIFAR-10 and ImageNet, upon vision transformers and ResNet. |
| Dataset Splits | Yes | We evaluate our model performance on CIFAR-10 and ImageNet, upon vision transformers and ResNet. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or cloud instance specifications used for running the experiments. |
| Software Dependencies | No | The paper mentions software like 'AdamW' and 'mmsegmentation' but does not provide specific version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | We use the AdamW (Loshchilov and Hutter 2018) optimizer with batch size 512 and an initial learning rate of 5 × 10⁻⁴ with cosine learning rate decay (Loshchilov and Hutter 2019). For our method, the first two stages are trained for 300 epochs each. For the final distillation refinement stage, we train the model for 400 epochs. |
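The learning-rate schedule quoted above (initial rate 5 × 10⁻⁴ with cosine decay over a stage's epochs) can be sketched as a small pure-Python helper. This is an illustrative assumption, not the authors' code: the function name `cosine_lr`, the floor value `min_lr=0.0`, and the per-epoch stepping are choices made here, since the paper (per this report) does not release an implementation.

```python
import math

def cosine_lr(epoch: int, total_epochs: int,
              base_lr: float = 5e-4, min_lr: float = 0.0) -> float:
    """Cosine learning-rate decay from base_lr down to min_lr.

    base_lr = 5e-4 matches the quoted setup; min_lr = 0.0 is an
    assumption, as the report does not state a floor value.
    """
    progress = epoch / total_epochs
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))

# Per the quoted setup: 300 epochs for each of the first two stages,
# 400 epochs for the final distillation refinement stage.
stage_schedule = [cosine_lr(e, 300) for e in range(300)]
refine_schedule = [cosine_lr(e, 400) for e in range(400)]
```

The schedule starts at the full 5 × 10⁻⁴, reaches half that value at the stage midpoint, and approaches the floor as the stage ends.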