MetaISP: Efficient RAW-to-sRGB Mappings with Merely 1M Parameters
Authors: Zigeng Chen, Chaowei Liu, Yuan Yuan, Michael Bi Mi, Xinchao Wang
IJCAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments on two publicly available RAW-to-sRGB datasets: the Zurich RAW to RGB (ZRR) dataset [Ignatov et al., 2020b] and the MAI2021 dataset [Ignatov et al., 2022b]. Our proposed model, MetaISP, was benchmarked against four cutting-edge models on the ZRR dataset, demonstrating superior performance in all metrics, including PSNR, SSIM and ΔE (see Table 1). Our MetaISP achieves the best performance on two large-scale datasets while being impressively computationally efficient and significantly lightweight. |
| Researcher Affiliation | Collaboration | Zigeng Chen¹, Chaowei Liu¹, Yuan Yuan², Michael Bi Mi², Xinchao Wang¹ (¹National University of Singapore, ²Huawei Technologies Ltd); zigeng99@u.nus.edu, e1011116@u.nus.edu, xinchao@nus.edu.sg |
| Pseudocode | No | The paper describes the proposed method using architectural diagrams (Figure 2, Figure 4) but does not include any explicit pseudocode blocks or algorithms. |
| Open Source Code | No | The paper does not provide any explicit statement about releasing source code or a link to a code repository for the methodology described. |
| Open Datasets | Yes | We conduct experiments on two publicly available RAW-to-sRGB datasets: the Zurich RAW to RGB (ZRR) dataset [Ignatov et al., 2020b] and the MAI2021 dataset [Ignatov et al., 2022b]. |
| Dataset Splits | No | For the ZRR dataset, all the training images were augmented by random horizontal and vertical flipping during the training. ... We follow the official division that 46.8k are used for training and 1.2k are used for testing. ... For the MAI21 dataset, no data augmentation methods were employed throughout the training phase. ... The dataset was then randomly divided into two parts, with 23k samples allocated for training and 1k samples reserved for testing. (The paper specifies training and testing splits but does not explicitly mention a validation split; a minimal split sketch follows the table.) |
| Hardware Specification | Yes | Our model is implemented in PyTorch [Paszke et al., 2019] and trained on 4 Nvidia Titan X GPUs with a batch size of 32. |
| Software Dependencies | No | Our model is implemented in PyTorch [Paszke et al., 2019] and trained on 4 Nvidia Titan X GPUs with a batch size of 32. The parameters of the network are optimized with ADAM [Kingma and Ba, 2014] algorithm. (While PyTorch is mentioned, a specific version number for the software dependency is not provided.) |
| Experiment Setup | Yes | Our model is implemented in PyTorch [Paszke et al., 2019] and trained on 4 Nvidia Titan X GPUs with a batch size of 32. The parameters of the network are optimized with ADAM [Kingma and Ba, 2014] algorithm. For the ZRR dataset, all the training images were augmented by random horizontal and vertical flipping during the training. ... First, the model is trained for 80 epochs with an initial learning rate of 5e-4, which is decayed to half after 50 epochs. The loss function is a combination of VGG-based perceptual loss [Johnson et al., 2016], SSIM loss [Wang et al., 2004] and Charbonnier loss [Zhang et al., 2018]: $L_{\mathrm{Stage1}} = 0.25\,L_{\mathrm{Char}} + L_{\mathrm{SSIM}} + L_{\mathrm{VGG}}$. Next, the model is fine-tuned for an additional 5 epochs with a learning rate of 2e-5. Only MSE loss and SSIM loss are employed for final tone adjustments and edge rendering: $L_{\mathrm{Stage2}} = 0.5\,L_{\mathrm{MSE}} + L_{\mathrm{SSIM}}$. (Sketches of the loss composition and the learning-rate schedule follow the table.) |
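
To make the MAI21 split quoted in the Dataset Splits row concrete, here is a minimal PyTorch sketch of a 23k/1k random division. The dataset object `mai21_pairs`, the helper name `split_mai21`, and the fixed seed are all assumptions; the paper does not report how the random split was seeded.

```python
import torch
from torch.utils.data import Dataset, random_split

def split_mai21(mai21_pairs: Dataset, seed: int = 0):
    # Randomly divide the 24k RAW/sRGB pairs into 23k train / 1k test,
    # as described in the paper. The seed is an assumption for
    # reproducibility; the paper does not specify one.
    generator = torch.Generator().manual_seed(seed)
    return random_split(mai21_pairs, [23_000, 1_000], generator=generator)
```
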
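The two-stage objective quoted in the Experiment Setup row can be sketched as below. This is not the authors' released implementation: `ssim_loss` and `vgg_perceptual` are assumed callables returning scalar losses (e.g. 1 − SSIM and a VGG-feature distance), and the Charbonnier epsilon is a common default rather than a value from the paper.

```python
import torch
import torch.nn.functional as F

def charbonnier(pred: torch.Tensor, target: torch.Tensor,
                eps: float = 1e-6) -> torch.Tensor:
    # Charbonnier loss: a smooth, differentiable variant of L1.
    return torch.sqrt((pred - target) ** 2 + eps ** 2).mean()

def stage1_loss(pred, target, ssim_loss, vgg_perceptual):
    # L_Stage1 = 0.25 * L_Char + L_SSIM + L_VGG
    return (0.25 * charbonnier(pred, target)
            + ssim_loss(pred, target)
            + vgg_perceptual(pred, target))

def stage2_loss(pred, target, ssim_loss):
    # L_Stage2 = 0.5 * L_MSE + L_SSIM
    return 0.5 * F.mse_loss(pred, target) + ssim_loss(pred, target)
```
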
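Likewise, the quoted optimization schedule (ADAM at 5e-4 for 80 epochs, halved after epoch 50, then 5 fine-tuning epochs at 2e-5) could be wired up as follows. `model` and the helper names are placeholders; `MultiStepLR` is one standard way to realize the one-time halving, not necessarily the authors' choice.

```python
import torch

def make_stage1_optim(model: torch.nn.Module):
    # Stage 1: ADAM at 5e-4, learning rate halved after epoch 50
    # (scheduler.step() is assumed to be called once per epoch).
    optimizer = torch.optim.Adam(model.parameters(), lr=5e-4)
    scheduler = torch.optim.lr_scheduler.MultiStepLR(
        optimizer, milestones=[50], gamma=0.5)
    return optimizer, scheduler

def make_stage2_optim(model: torch.nn.Module):
    # Stage 2: 5-epoch fine-tuning at a fixed 2e-5.
    return torch.optim.Adam(model.parameters(), lr=2e-5)
```
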