Gaussian Kernel Mixture Network for Single Image Defocus Deblurring

Authors: Yuhui Quan, Zicong Wu, Hui Ji

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments show that the GKMNet not only noticeably outperforms existing defocus deblurring methods, but also has its advantages in terms of model complexity and computational efficiency. The GKMNet is trained and tested on the DPD dataset without using the dual-pixel images. In addition, we test GKMNet on the RTF test set [7] as well as the images from CUHK-BD dataset [42]. Three image quality metrics are used for quantitative evaluation, including two standard metrics: PSNR (Peak Signal to Noise Ratio) and SSIM (Structural Similarity Index Measure) [43], and the LPIPS (Learned Perceptual Image Patch Similarity) [44] for perceptual quality (also used in [18]).
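As a reference for how these three metrics could be computed on a restored/ground-truth image pair, the sketch below uses scikit-image for PSNR and SSIM and the lpips package for LPIPS; the [0, 1] float image format and the AlexNet LPIPS backbone are assumptions for illustration, not details taken from the paper's code.

    # Minimal sketch of the three reported metrics (PSNR, SSIM, LPIPS).
    # Assumes HxWx3 float32 images in [0, 1]; scikit-image and the `lpips`
    # package supply the metric implementations.
    import numpy as np
    import torch
    import lpips
    from skimage.metrics import peak_signal_noise_ratio, structural_similarity

    _lpips_net = lpips.LPIPS(net='alex')  # AlexNet backbone (an assumption)

    def evaluate_pair(restored: np.ndarray, sharp: np.ndarray) -> dict:
        psnr = peak_signal_noise_ratio(sharp, restored, data_range=1.0)
        ssim = structural_similarity(sharp, restored, data_range=1.0,
                                     channel_axis=-1)
        # LPIPS expects NCHW tensors scaled to [-1, 1]
        to_t = lambda x: torch.from_numpy(x).permute(2, 0, 1)[None].float() * 2 - 1
        with torch.no_grad():
            lp = _lpips_net(to_t(restored), to_t(sharp)).item()
        return {'PSNR': psnr, 'SSIM': ssim, 'LPIPS': lp}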
Researcher Affiliation | Academia | Yuhui Quan, Zicong Wu (School of Computer Science and Engineering, South China University of Technology); Hui Ji (Department of Mathematics, National University of Singapore)
Pseudocode | No | The paper describes the network architecture with diagrams and mathematical equations, but it does not provide explicit pseudocode or an algorithm block.
Open Source Code | Yes | The code of GKMNet is available at https://github.com/csZcWu/GKMNet.
Open Datasets | Yes | There are few datasets available for benchmarking defocus deblurring. The most well-known one is the DPD dataset [18]. It provides 500 pairs of images with defocus blur and their corresponding all-in-focus images, as well as the two associated sub-aperture views called dual-pixel images, all in 16-bit color. The training/validation/test splits in the dataset consist of 350/74/76 samples.
Dataset Splits | Yes | The training/validation/test splits in the dataset consist of 350/74/76 samples.
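For illustration, a hypothetical paired-image dataset wrapper over such a split is sketched below; the train/val/test directory layout and the blurred/sharp folder names are assumptions, not the DPD dataset's documented structure.

    # Hypothetical loader for one split of a paired deblurring dataset
    # (e.g. the 350/74/76 DPD split). Directory names are assumptions.
    from pathlib import Path
    from PIL import Image
    from torch.utils.data import Dataset

    class PairedDeblurSet(Dataset):
        def __init__(self, root, split='train', transform=None):
            base = Path(root) / split                       # e.g. <root>/train
            self.blurred = sorted((base / 'blurred').glob('*.png'))
            self.sharp = sorted((base / 'sharp').glob('*.png'))
            assert len(self.blurred) == len(self.sharp)
            self.transform = transform

        def __len__(self):
            return len(self.blurred)

        def __getitem__(self, i):
            pair = (Image.open(self.blurred[i]), Image.open(self.sharp[i]))
            return self.transform(pair) if self.transform else pair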
Hardware Specification | Yes | The computational efficiency is measured by the average inference time on an image of 1680×1120 pixels, tested on an Intel i5-9600KF CPU and on an NVIDIA GTX 1080Ti GPU, respectively. Its training on the DPD dataset takes less than 32 hours, while SRN and AttNet take nearly three days, on an NVIDIA GTX 1080Ti GPU.
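A minimal sketch of how such an average inference time could be measured in PyTorch is given below; the model argument, number of timed runs, and warm-up count are placeholders rather than the paper's benchmarking code.

    # Sketch of timing a model on a 1680x1120 RGB input on CPU or GPU.
    # `model` is a placeholder for a trained network such as GKMNet.
    import time
    import torch

    def average_inference_time(model, device, runs=20):
        model = model.to(device).eval()
        x = torch.randn(1, 3, 1120, 1680, device=device)  # 1680x1120 image
        with torch.no_grad():
            for _ in range(3):                  # warm-up iterations
                model(x)
            if device.type == 'cuda':
                torch.cuda.synchronize()        # flush queued GPU work
            start = time.perf_counter()
            for _ in range(runs):
                model(x)
            if device.type == 'cuda':
                torch.cuda.synchronize()        # wait for all GPU kernels
        return (time.perf_counter() - start) / runs

    # Usage: average_inference_time(model, torch.device('cuda'))
    #        average_inference_time(model, torch.device('cpu'))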
Software Dependencies | No | The paper mentions "The Adam optimizer [46] is used for training" and cites "Automatic differentiation in PyTorch" (2017), implying PyTorch is used, but it does not specify concrete version numbers for PyTorch or other key software components such as CUDA.
Experiment Setup | Yes | Through all experiments, the maximum size of Gaussian kernels in GCM is set to M = 21. The number of scales is set to T = 3. The learnable parameters in SRAM are initialized by Xavier [45]. The Adam optimizer [46] is used for training with 3000 epochs and batch size 4. The learning rate is fixed at 10^-4 in the first 2000 epochs and decayed to 10^-5 in the last 1000 epochs. The cost function in (10) is set to the squared ℓ2 loss in the first 1000 iterations, and alternatively set to the SSIM loss and squared ℓ2 loss every 500 epochs afterwards. Data augmentation is done by random cropping to 256×256 pixels.
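The reported schedule can be read as the schematic PyTorch training loop below; the model, SSIM loss, and data pipeline are placeholders, and the loss-alternation logic reflects our reading of the quoted description rather than the authors' released code.

    # Schematic training loop matching the reported schedule: Adam, 3000
    # epochs, batch size 4, lr 1e-4 then 1e-5 after epoch 2000, squared L2
    # loss for the first 1000 epochs, then SSIM and squared L2 alternated
    # every 500 epochs. `model`, `train_loader`, `ssim_loss` are placeholders.
    import torch

    def select_loss(epoch, ssim_loss, l2_loss):
        if epoch < 1000:
            return l2_loss
        # After epoch 1000, alternate SSIM and L2 every 500 epochs
        # (our reading of the quoted description).
        return ssim_loss if ((epoch - 1000) // 500) % 2 == 0 else l2_loss

    def train(model, train_loader, ssim_loss, epochs=3000):
        l2_loss = torch.nn.MSELoss()
        opt = torch.optim.Adam(model.parameters(), lr=1e-4)
        for epoch in range(epochs):
            for g in opt.param_groups:
                g['lr'] = 1e-4 if epoch < 2000 else 1e-5
            criterion = select_loss(epoch, ssim_loss, l2_loss)
            for blurred, sharp in train_loader:  # 256x256 random crops
                opt.zero_grad()
                loss = criterion(model(blurred), sharp)
                loss.backward()
                opt.step()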