Memory-Oriented Structural Pruning for Efficient Image Restoration

Authors: Xiangsheng Shi, Xuefei Ning, Lidong Guo, Tianchen Zhao, Enshu Liu, Yi Cai, Yuhan Dong, Huazhong Yang, Yu Wang

AAAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on real image denoising, image super-resolution, and low-light image enhancement show that MOSP can yield models with higher memory efficiency while better preserving performance compared with baseline pruning methods.
Researcher Affiliation | Academia | (1) Department of Electronic Engineering, Tsinghua University; (2) Shenzhen International Graduate School, Tsinghua University; (3) School of Materials Science and Engineering, Tsinghua University
Pseudocode | Yes | To get a solution for the problem defined in Eqn. 1, we propose a memory-oriented pruning flow (see Appendix for detailed algorithm).
Open Source Code | No | The paper does not provide any explicit statement about releasing source code or a link to a code repository.
Open Datasets | Yes | For real image denoising, we use 320 high-resolution images in the SIDD dataset (Abdelhamed, Lin, and Brown 2018) as the training data.
Dataset Splits | No | The paper mentions '320 high-resolution images... as the training data' and '1,280 validation patches in SIDD' for evaluation. However, it does not provide a complete training/validation/test split with explicit percentages or absolute counts, nor does it explicitly define a test set separate from the validation set.
Hardware Specification | No | The paper does not specify any particular GPU models, CPU types, or other hardware details used for running the experiments.
Software Dependencies | No | The paper mentions using the 'Adam optimizer' but does not provide version numbers for any software libraries (e.g., Python, PyTorch, TensorFlow) or other dependencies.
Experiment Setup | Yes | The models are trained on 256×256 patches with a batch size of 32. Random horizontal and vertical flips are applied to the training patches as data augmentation. We use the Adam optimizer (Kingma and Ba 2014) with β1 = 0.9, β2 = 0.999, and ϵ = 1e−8. In the pretrain stage, we train the baseline models for 80 epochs, with the initial learning rate set to 1×10⁻⁴ and halved every 20 epochs. ... The learning rate is set to 1×10⁻⁴ and decreased to 5×10⁻⁵ after 10 epochs. As for the MOSP hyper-parameters, the outer memory stride is by default 2 MB and the inner memory step is set to the highest per-channel memory in the currently selected group.
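For reference, the following is a minimal sketch (not the authors' code) of the reported pretraining setup, assuming a PyTorch implementation. The random tensor data, the small convolutional stand-in model, and the L1 loss are assumptions introduced for illustration; only the hyperparameters (256×256 patches, batch size 32, Adam with β1 = 0.9, β2 = 0.999, ϵ = 1e−8, initial learning rate 1×10⁻⁴ halved every 20 epochs over 80 pretraining epochs) come from the quoted experiment setup.

import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Stand-in data: random 256x256 noisy/clean patch pairs. The paper trains on
# SIDD patches with random horizontal/vertical flips; the flips are omitted
# here for brevity.
noisy = torch.rand(64, 3, 256, 256)
clean = torch.rand(64, 3, 256, 256)
train_loader = DataLoader(TensorDataset(noisy, clean),
                          batch_size=32, shuffle=True)

# Stand-in restoration model (the paper prunes existing restoration backbones).
model = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
                      nn.Conv2d(64, 3, 3, padding=1))

# Reported optimizer settings: Adam with beta1=0.9, beta2=0.999, eps=1e-8,
# initial learning rate 1e-4.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4,
                             betas=(0.9, 0.999), eps=1e-8)
# Pretrain stage: 80 epochs, learning rate halved every 20 epochs.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=20, gamma=0.5)

for epoch in range(80):
    for x, y in train_loader:
        loss = nn.functional.l1_loss(model(x), y)  # loss choice is an assumption
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    scheduler.step()

The fine-tuning schedule after pruning (learning rate 1×10⁻⁴ dropped to 5×10⁻⁵ after 10 epochs) and the MOSP-specific hyperparameters (2 MB outer memory stride, inner memory step tied to the largest per-channel memory in the selected group) are not reflected in this sketch, since the pruning flow itself is only given in the paper's appendix.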