Revisiting Dynamic Convolution via Matrix Decomposition
Authors: Yunsheng Li, Yinpeng Chen, Xiyang Dai, Mengchen Liu, Dongdong Chen, Ye Yu, Lu Yuan, Zicheng Liu, Mei Chen, Nuno Vasconcelos
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we present the results of DCD on ImageNet classification (Deng et al., 2009). ImageNet has 1,000 classes with 1,281,167 training and 50,000 validation images. We also report ablation studies on different components of the approach. |
| Researcher Affiliation | Collaboration | Yunsheng Li1, Yinpeng Chen2, Xiyang Dai2, Mengchen Liu2, Dongdong Chen2, Ye Yu2, Lu Yuan2, Zicheng Liu2, Mei Chen2, Nuno Vasconcelos1 1 Department of Electrical and Computer Engineering, University of California San Diego 2 Microsoft |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Source code is at https://github.com/liyunsheng13/dcd. |
| Open Datasets | Yes | In this section, we present the results of DCD on ImageNet classification (Deng et al., 2009). ImageNet has 1,000 classes with 1,281,167 training and 50,000 validation images. |
| Dataset Splits | Yes | ImageNet has 1,000 classes with 1,281,167 training and 50,000 validation images. All models are trained by SGD with momentum 0.9. The batch size is 256 and remaining training parameters are as follows. |
| Hardware Specification | Yes | We use a single-threaded core AMD EPYC CPU 7551P (2.0 GHz) to measure running time (in milliseconds) on MobileNetV2 0.5 and 1.0. |
| Software Dependencies | No | The paper mentions "PyTorch" but does not specify a version number or any other software dependencies with versions. |
| Experiment Setup | Yes | All models are trained by SGD with momentum 0.9. The batch size is 256 and remaining training parameters are as follows. ResNet: The learning rate starts at 0.1 and is divided by 10 every 30 epochs. The model is trained with 100 epochs. Dropout (Srivastava et al., 2014) 0.1 is used only for ResNet-50. MobileNetV2: The initial learning rate is 0.05 and decays to 0 in 300 epochs, according to a cosine function. Weight decay of 2e-5 and a dropout rate of 0.1 are also used. |
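The learning-rate schedules quoted in the Experiment Setup row can be sketched as plain functions, which is useful for checking a reimplementation against the paper's numbers. This is a minimal sketch of the two schedules as described (step decay for ResNet, cosine decay for MobileNetV2); the function names are our own, not from the paper or its code release.

```python
import math

def resnet_lr(epoch, base_lr=0.1):
    # ResNet schedule as quoted: lr starts at 0.1 and is divided by 10
    # every 30 epochs; the model is trained for 100 epochs.
    return base_lr * (0.1 ** (epoch // 30))

def mobilenetv2_lr(epoch, base_lr=0.05, total_epochs=300):
    # MobileNetV2 schedule as quoted: initial lr 0.05 decaying to 0
    # over 300 epochs following a cosine function.
    return 0.5 * base_lr * (1 + math.cos(math.pi * epoch / total_epochs))
```

For example, `resnet_lr(0)` gives 0.1 and `resnet_lr(60)` gives 0.001, matching the "divide by 10 every 30 epochs" rule, while `mobilenetv2_lr(300)` reaches 0.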