Wavelet Feature Maps Compression for Image-to-Image CNNs

Authors: Shahaf E. Finder, Yair Zohav, Maor Ashkenazi, Eran Treister

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 'We experiment with various tasks that benefit from high-resolution input. By combining WCC with light quantization, we achieve compression rates equivalent to 1-4 bit activation quantization with relatively small and much more graceful degradation in performance. Our code is available at https://github.com/BGUCompSci/WaveletCompressedConvolution.' Section 5 is dedicated to experimental evaluation, presenting results in tables (Tables 1, 2, 3, and 4) and figures (Figures 1, 3, 4, and 5) across multiple datasets and tasks, including object detection, semantic segmentation, monocular depth estimation, and super-resolution.
Researcher Affiliation | Academia | Shahaf E. Finder, Yair Zohav, Maor Ashkenazi, Eran Treister. The Department of Computer Science, Ben-Gurion University. [finders,maorash]@post.bgu.ac.il, erant@cs.bgu.ac.il
Pseudocode | Yes | The workflow is illustrated in Figure 2, and an explicit algorithm appears in Appendix B, which is titled 'Explicit Algorithm'. (A rough sketch of the wavelet-compression idea is given after this table.)
Open Source Code | Yes | 'Our code is available at https://github.com/BGUCompSci/WaveletCompressedConvolution.'
Open Datasets | Yes | 'We train and evaluate the networks on the MS COCO 2017 [39] object detection dataset.' 'We evaluated our proposed method on the Cityscapes and Pascal VOC datasets.' 'The Cityscapes dataset [12]...' 'The Pascal VOC [19] dataset...' 'We evaluated the results on the KITTI dataset [21]...' 'For this task, we chose the popular EDSR network [38], trained on the DIV2K dataset [1].'
Dataset Splits | Yes | The MS COCO 2017 dataset contains 118K training images and 5K validation images. For Cityscapes, 'During training, we used a random crop of size 768×768 and no crop for the validation set.' For Monodepth2, 'The train/validation split is the default selected by Monodepth2 (based on [71]), and we evaluate it on the ground truths provided by the KITTI depth benchmark.'
Hardware Specification | Yes | We ran our experiments on an NVIDIA 24GB RTX 3090 GPU.
Software Dependencies | No | The paper states 'We implemented our code using PyTorch [47], based on Torchvision and public implementations of the chosen networks.' However, it does not provide specific version numbers for PyTorch or Torchvision, which are required for a reproducible description of software dependencies.
Experiment Setup | Yes | For object detection, 'We use the AdamW optimizer, with a learning rate of 10^-3 when initially applying WCC layers and 10^-4 for finetuning. In addition, we apply a learning rate warm-up in the first epoch of training, followed by a cosine learning rate decay. Each compression step is finetuned for 20 to 40 epochs.' Similar detailed settings are provided for semantic segmentation and monocular depth estimation in their respective sections. (A minimal sketch of this optimizer and schedule follows the wavelet sketch below.)
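
Appendix B of the paper gives the authors' explicit algorithm; the sketch below is only a rough, hand-rolled illustration of the underlying idea (wavelet-transforming a feature map and discarding small detail coefficients), not the paper's WCC layer, which performs the convolution itself in the wavelet domain. The single-level Haar transform and the threshold value are assumptions made purely for illustration.

    import torch

    def haar_dwt2(x):
        # Single-level 2D Haar DWT of a feature map x of shape (N, C, H, W),
        # with H and W even. Returns subbands LL, LH, HL, HH at half resolution.
        a = x[..., 0::2, 0::2]
        b = x[..., 0::2, 1::2]
        c = x[..., 1::2, 0::2]
        d = x[..., 1::2, 1::2]
        ll = (a + b + c + d) / 2
        lh = (a + b - c - d) / 2
        hl = (a - b + c - d) / 2
        hh = (a - b - c + d) / 2
        return ll, lh, hl, hh

    def haar_idwt2(ll, lh, hl, hh):
        # Exact inverse of haar_dwt2.
        a = (ll + lh + hl + hh) / 2
        b = (ll + lh - hl - hh) / 2
        c = (ll - lh + hl - hh) / 2
        d = (ll - lh - hl + hh) / 2
        n, ch, h, w = ll.shape
        x = ll.new_zeros(n, ch, 2 * h, 2 * w)
        x[..., 0::2, 0::2] = a
        x[..., 0::2, 1::2] = b
        x[..., 1::2, 0::2] = c
        x[..., 1::2, 1::2] = d
        return x

    def compress_feature_map(x, threshold=0.1):
        # Zero out small detail coefficients; keep the low-pass band intact.
        ll, lh, hl, hh = haar_dwt2(x)
        lh, hl, hh = [torch.where(s.abs() > threshold, s, torch.zeros_like(s))
                      for s in (lh, hl, hh)]
        return haar_idwt2(ll, lh, hl, hh)

    x = torch.randn(1, 64, 128, 128)       # dummy feature map
    x_hat = compress_feature_map(x, 0.2)   # lossy, sparser reconstruction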
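
The optimizer and schedule quoted in the Experiment Setup row translate almost directly into PyTorch. The following is a minimal sketch under assumed step counts and a stand-in model; only the AdamW choice, the 1e-3/1e-4 learning rates, the one-epoch warm-up, and the cosine decay come from the paper.

    import math
    import torch
    from torch.optim import AdamW
    from torch.optim.lr_scheduler import LambdaLR

    model = torch.nn.Conv2d(3, 16, 3)    # stand-in for the actual network
    epochs, steps_per_epoch = 30, 1000   # assumed; the paper finetunes 20-40 epochs

    # 1e-3 when initially applying WCC layers, 1e-4 for finetuning (per the quote).
    optimizer = AdamW(model.parameters(), lr=1e-3)

    def warmup_cosine(step):
        # Linear warm-up over the first epoch, then cosine decay to zero.
        warmup_steps = steps_per_epoch
        total_steps = epochs * steps_per_epoch
        if step < warmup_steps:
            return step / max(1, warmup_steps)
        progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
        return 0.5 * (1.0 + math.cos(math.pi * progress))

    scheduler = LambdaLR(optimizer, lr_lambda=warmup_cosine)

    # In the training loop: loss.backward(); optimizer.step(); scheduler.step()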