CLEARER: Multi-Scale Neural Architecture Search for Image Restoration

Authors: Yuanbiao Gou, Boyun Li, Zitao Liu, Songfan Yang, Xi Peng

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments show the promising performance of our method compared with nine image denoising methods and eight image deraining approaches in quantitative and qualitative evaluations.
Researcher Affiliation | Collaboration | Yuanbiao Gou (College of Computer Science, Sichuan University, China, gouyuanbiao@gmail.com); Boyun Li (College of Computer Science, Sichuan University, China, liboyun.gm@gmail.com); Zitao Liu (TAL Education Group, Beijing, China, liuzitao@100tal.com); Songfan Yang (TAL Education Group, Beijing, China, songfan.yang@qq.com); Xi Peng (College of Computer Science, Sichuan University, China, pengx.gm@gmail.com)
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | The code is available at https://github.com/limit-scu.
Open Datasets | Yes | We carry out denoising experiments on three datasets, i.e., BSD500 [28], BSD68 [48], and Set12.
Dataset Splits | Yes | We utilize the training set and validation set from BSD500 to train and find the best neural architecture with the highest performance. For a fair comparison, we randomly sample 100 images from the training images for validation and use the remaining 600 images for training. All the test images are used for testing. (A split sketch follows the table.)
Hardware Specification | Yes | The proposed method is time efficient, which only takes two hours to search architectures using a single V100 GPU. (Table excerpt: CLEARER, 1 Tesla V100, 6.50, gradient.)
Software Dependencies | No | The paper mentions optimizers (SGD, Adam) and learning rate strategies (cosine annealing) but does not provide specific software library names with version numbers (e.g., PyTorch 1.9, TensorFlow 2.x).
Experiment Setup | Yes | We adopt the standard SGD optimizer with a momentum of 0.9 and a weight decay of 0.0003 to optimize the parametric model. The learning rate automatically decays from 0.025 to 0.001 via the cosine annealing strategy [25]. To optimize the architecture parameters, we adopt the Adam [13] optimizer with a learning rate of 0.0003 and a weight decay of 0.001. In the search process, we build a data batch of size 32 by randomly cropping 64×64 patches from training images, and feed the batch to the network for a maximum of 10,000 iterations. For fair comparisons, we simply set λ1 = 0.01 and λ2 = 0 by ignoring the model complexity, like the compared approaches. (A training-configuration sketch follows the table.)
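
As a companion to the Dataset Splits row, here is a minimal sketch of the 600/100 BSD500 split described above. It assumes the 700 BSD500 training and validation images sit in one local directory; the directory path, file extension, and random seed are illustrative and not taken from the paper or the authors' repository.

    import random
    from pathlib import Path

    # Hypothetical location of the 700 BSD500 train/val images; adjust to your layout.
    BSD500_TRAIN_DIR = Path("data/BSD500/images/train_val")

    def split_bsd500(image_dir: Path, num_val: int = 100, seed: int = 0):
        """Randomly hold out `num_val` images for validation and keep the rest
        for training, mirroring the 600/100 split described in the paper."""
        images = sorted(image_dir.glob("*.jpg"))
        rng = random.Random(seed)
        val_images = set(rng.sample(images, num_val))
        train_images = [p for p in images if p not in val_images]
        return train_images, sorted(val_images)

    train_files, val_files = split_bsd500(BSD500_TRAIN_DIR)
    print(len(train_files), len(val_files))  # expected: 600 100

All test images are then used untouched for evaluation, as the row states.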
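For the Experiment Setup row, the sketch below wires the reported search-phase hyperparameters into PyTorch (the paper does not name a framework, so PyTorch is an assumption): SGD with momentum 0.9 and weight decay 0.0003 under cosine annealing from 0.025 to 0.001 for the network weights, Adam with learning rate 0.0003 and weight decay 0.001 for the architecture parameters, batches of 32 random 64×64 crops, and 10,000 iterations. The tiny mixed operation, the random tensors, and the L1 loss are placeholders; the actual CLEARER super-network, its λ1/λ2-weighted objective, and any alternating train/validation updates are in the authors' repository.

    import torch
    import torch.nn.functional as F

    # Stand-in for the searched model: one gradient-based (DARTS-style) mixed operation
    # whose two candidate convolutions are weighted by a softmax over architecture
    # parameters. The real CLEARER multi-scale super-network is in the authors' repo.
    candidates = torch.nn.ModuleList([
        torch.nn.Conv2d(3, 3, kernel_size=3, padding=1),
        torch.nn.Conv2d(3, 3, kernel_size=5, padding=2),
    ])
    arch_params = torch.zeros(len(candidates), requires_grad=True)

    def forward(x):
        weights = torch.softmax(arch_params, dim=0)
        return sum(w * op(x) for w, op in zip(weights, candidates))

    MAX_ITERS = 10_000  # maximal number of search iterations reported in the paper

    # Network weights: SGD with momentum 0.9 and weight decay 0.0003; learning rate
    # annealed from 0.025 down to 0.001 via cosine annealing.
    w_optimizer = torch.optim.SGD(candidates.parameters(), lr=0.025,
                                  momentum=0.9, weight_decay=0.0003)
    w_scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
        w_optimizer, T_max=MAX_ITERS, eta_min=0.001)

    # Architecture parameters: Adam with learning rate 0.0003 and weight decay 0.001.
    a_optimizer = torch.optim.Adam([arch_params], lr=0.0003, weight_decay=0.001)

    for it in range(MAX_ITERS):
        # Stand-ins for a batch of 32 random 64x64 crops and their clean targets.
        noisy = torch.rand(32, 3, 64, 64)
        clean = torch.rand(32, 3, 64, 64)

        # Simplified restoration objective; the paper's full loss additionally
        # carries complexity terms weighted by lambda1 = 0.01 and lambda2 = 0.
        loss = F.l1_loss(forward(noisy), clean)

        w_optimizer.zero_grad()
        a_optimizer.zero_grad()
        loss.backward()
        w_optimizer.step()
        a_optimizer.step()
        w_scheduler.step()

Note that this single loop updates weights and architecture parameters on the same batch purely to keep the sketch short; gradient-based NAS methods typically alternate the two updates on training and validation batches respectively.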