Batch-shaping for learning conditional channel gated networks
Authors: Babak Ehteshami Bejnordi, Tijmen Blankevoort, Max Welling
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present results on CIFAR-10 and ImageNet datasets for image classification, and Cityscapes for semantic segmentation. Our results show that our method can slim down large architectures conditionally, such that the average computational cost on the data is on par with a smaller architecture, but with higher accuracy. |
| Researcher Affiliation | Industry | Babak Ehteshami Bejnordi, Tijmen Blankevoort & Max Welling Qualcomm AI Research Amsterdam, The Netherlands {behtesha,tijmen,mwelling}@qti.qualcomm.com |
| Pseudocode | Yes | The pseudo-code for the implementation of the Batch-Shaping loss is presented in the Appendix A. (A hedged sketch of such a loss appears after this table.) |
| Open Source Code | No | The paper does not contain any explicit statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We evaluate the performance of our method on two image classification benchmarks: CIFAR-10 (Krizhevsky, 2009) and ImageNet (Russakovsky et al., 2015). We additionally report preliminary results on the Cityscapes semantic segmentation benchmark (Cordts et al., 2016). |
| Dataset Splits | Yes | The original PSP network achieves an overall IoU (intersection over union) of 0.706 with a pixel-level accuracy of 0.929 on the validation set. ... Figure 5 shows the distribution of gates on the ImageNet validation set for our ResNet34-BAS and ResNet34-L0 models. |
| Hardware Specification | Yes | All the reported inference times were measured using a machine equipped with an Intel Xeon E5-1620 v4 CPU and an Nvidia GTX 1080 Ti GPU. |
| Software Dependencies | No | The paper mentions implementing aspects in PyTorch ("Computation can be done on sliced tensors, which we implemented in Pytorch."), but it does not specify any version numbers for PyTorch or any other software libraries or solvers used. |
| Experiment Setup | Yes | The training details and hyperparameters for our gated networks trained on CIFAR10, ImageNet, and Cityscapes are provided in the appendix B. ... We trained the models for 500 epochs with a mini-batch of 256. The initial learning rate was 0.1 and it was divided by 10 at epoch 300, 375, and 450. ... For the L0-loss we used γ values of {0, 1, 2, 5, 10, 15, 20} × 10^-2 to generate different trade-off points. (A sketch of this learning-rate schedule appears after the table, following the Batch-Shaping sketch.) |
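For readers who want a concrete picture of the pseudo-code referenced in the Pseudocode row (Appendix A of the paper): the Batch-Shaping loss pushes the empirical distribution of each gate's activations over a batch toward a Beta prior using a Cramér–von Mises-style statistic. The sketch below is a minimal PyTorch reconstruction under that reading, not the authors' Appendix A code; the prior parameters `a=0.6, b=0.4`, the empirical-CDF normalisation `i/(N+1)`, and the use of SciPy for the Beta CDF/PDF are assumptions.

```python
import numpy as np
import torch
from scipy.stats import beta as beta_dist


class _BatchShaping(torch.autograd.Function):
    """Cramér-von-Mises-style fit of one gate's batch of activations to a Beta(a, b) prior.
    The backward pass uses the Beta PDF, i.e. the derivative of the CDF."""

    @staticmethod
    def forward(ctx, x, a, b):
        # x: 1-D tensor of gate activations in [0, 1] for one gate over the batch
        x_sorted, sort_idx = torch.sort(x)
        x_np = x_sorted.detach().cpu().double().numpy()

        n = x_np.shape[0]
        p_cdf = beta_dist.cdf(x_np, a, b)            # prior CDF at the sorted samples
        e_cdf = np.arange(1, n + 1) / (n + 1.0)      # empirical CDF (assumed normalisation)
        diff = e_cdf - p_cdf

        pdf = beta_dist.pdf(np.clip(x_np, 1e-6, 1 - 1e-6), a, b)
        ctx.save_for_backward(
            torch.as_tensor(diff, dtype=x.dtype, device=x.device),
            torch.as_tensor(pdf, dtype=x.dtype, device=x.device),
            sort_idx,
        )
        return x.new_tensor((diff ** 2).sum())

    @staticmethod
    def backward(ctx, grad_out):
        diff, pdf, sort_idx = ctx.saved_tensors
        # d/dx (e_cdf - BetaCDF(x))^2 = -2 * (e_cdf - BetaCDF(x)) * BetaPDF(x)
        grad_sorted = -2.0 * diff * pdf
        grad_x = torch.empty_like(grad_sorted)
        grad_x[sort_idx] = grad_sorted               # undo the sort permutation
        return grad_out * grad_x, None, None


def batch_shaping_loss(gate_activations, a=0.6, b=0.4):
    """Per-gate loss; in practice it is summed over gates and added to the task loss with a chosen weight."""
    return _BatchShaping.apply(gate_activations, a, b)


if __name__ == "__main__":
    g = torch.sigmoid(torch.randn(256, requires_grad=True))  # toy gate outputs for a batch of 256
    loss = batch_shaping_loss(g)
    loss.backward()
    print(float(loss))
```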
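The training recipe quoted in the Experiment Setup row (500 epochs, mini-batch 256, initial learning rate 0.1 divided by 10 at epochs 300, 375, and 450) maps directly onto a standard step schedule. The following is a minimal sketch of that schedule only; the optimizer choice (SGD with momentum) and the placeholder model and data are assumptions not stated in the quoted excerpt.

```python
import torch

model = torch.nn.Linear(8, 2)  # placeholder for the actual gated network

# SGD with momentum is an assumption; the excerpt only fixes lr, batch size, epochs, and milestones.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[300, 375, 450], gamma=0.1  # divide the lr by 10 at these epochs
)

for epoch in range(500):
    # one dummy mini-batch per epoch; the real loop iterates over the dataset with batch size 256
    x, y = torch.randn(256, 8), torch.randint(0, 2, (256,))
    loss = torch.nn.functional.cross_entropy(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()  # lr: 0.1 -> 0.01 at epoch 300 -> 0.001 at 375 -> 0.0001 at 450
```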