RAN4IQA: Restorative Adversarial Nets for No-Reference Image Quality Assessment

Authors: Hongyu Ren, Diqi Chen, Yizhou Wang

AAAI 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on Waterloo Exploration, LIVE and TID2013 show the effectiveness and generalization ability of RAN compared to the state-of-the-art NR-IQA models.
Researcher Affiliation | Academia | Nat'l Engineering Laboratory for Video Technology, Cooperative Medianet Innovation Center, Key Laboratory of Machine Perception (MoE), School of Electronics Engineering and Computer Science, Peking University; Key Laboratory of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing, 100190, China; University of Chinese Academy of Sciences, Beijing, 100049, China; {rhy, cdq, yizhou.wang}@pku.edu.cn
Pseudocode | No | The paper provides architectural diagrams (Figure 1) detailing the network structures, but no pseudocode or algorithm blocks.
Open Source Code | No | No explicit statement about releasing source code or a link to a code repository was found.
Open Datasets | Yes | We pretrain RAN on Waterloo Exploration (Ma et al. 2017), perform cross validation on TID2013 (Ponomarenko et al. 2015) and LIVE (Sheikh et al. 2005).
Dataset Splits | Yes | On TID2013, we randomly pick 60% as the training set, 20% as the validation set and the remaining 20% as the test set. We also perform a two-sided t-test on average SROCC and PLCC between the proposed model and all the other mentioned NR-IQA models. The null hypothesis is that the two IQA methods have equal SROCC at the 95% confidence level; the alternative hypothesis is that our model has higher/lower correlation. We make the analogous hypothesis on PLCC. We randomly split TID2013, then train and test, 15 times, and obtain the results for all the models. To further test the robustness and generalization of RAN, we test our model on LIVE, where we split the dataset 6:2:2 for train, validation and test respectively. (A sketch of this split-and-test protocol appears after the table.)
Hardware Specification | Yes | All the training is performed on an NVIDIA Tesla K40 GPU.
Software Dependencies | No | The implementation is in TensorFlow (Abadi et al. 2016); no specific version numbers for software dependencies are provided beyond the mention of TensorFlow.
Experiment Setup | Yes | All the training is performed on an NVIDIA Tesla K40 GPU. We crop every image into 64×64 non-overlapping patches. The implementation is in TensorFlow (Abadi et al. 2016). In the pretrain step, we first train the restorator on the labels (pristine patches) to avoid unwanted minima, using the Adam optimizer (Kingma and Ba 2014) at a learning rate of 10^-4 for 300,000 iterations. Then we train the restorator and the discriminator together using RMSProp (Tieleman and Hinton 2012) at a learning rate of 10^-4 for 300,000 iterations and at a lower learning rate of 10^-5 for another 300,000 iterations; in each iteration, we train the discriminator 5 times and the restorator once. We then freeze the weights of the restorator and discriminator and pretrain the evaluator using Adam at a learning rate of 10^-4 for 300,000 iterations. In the finetune step, we again freeze the weights of the restorator and discriminator and train only the evaluator, using the same optimizer and learning rate, on the respective datasets. Since Waterloo Exploration only contains 4 distortion types, we finetune and test RAN on them: Gaussian Blur, White Noise, JPEG and JP2K. On TID2013, we randomly pick 60% as the training set, 20% as the validation set and the remaining 20% as the test set, and finetune the evaluator for 20,000 iterations. LIVE has fewer images than TID2013, so we finetune the evaluator for 15,000 iterations. (A sketch of this staged training schedule appears after the table.)
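
The Dataset Splits row describes a protocol of repeated random 60/20/20 splits (15 repetitions) followed by a two-sided t-test on SROCC/PLCC at the 95% confidence level. The sketch below is one possible implementation of that protocol, not the authors' code: the fit_and_score routine, the data handling and the choice of an unpaired t-test are assumptions made here for illustration.

```python
# Minimal sketch of the repeated-split evaluation protocol (assumed details noted above).
import numpy as np
from scipy import stats

def split_60_20_20(n_items, rng):
    """Randomly partition item indices into 60% train, 20% validation, 20% test."""
    idx = rng.permutation(n_items)
    n_train, n_val = int(0.6 * n_items), int(0.2 * n_items)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

def srocc_plcc(pred, mos):
    """Spearman rank-order and Pearson linear correlation against subjective scores (MOS)."""
    return stats.spearmanr(pred, mos)[0], stats.pearsonr(pred, mos)[0]

def srocc_over_splits(fit_and_score, n_items, n_repeats=15, seed=0):
    """Repeat the random split 15 times; fit_and_score is a hypothetical train/eval routine
    returning (predicted scores, ground-truth MOS) on the held-out test indices."""
    rng = np.random.default_rng(seed)
    sroccs = []
    for _ in range(n_repeats):
        train_idx, val_idx, test_idx = split_60_20_20(n_items, rng)
        pred, mos = fit_and_score(train_idx, val_idx, test_idx)
        sroccs.append(srocc_plcc(pred, mos)[0])
    return np.asarray(sroccs)

def compare_models(sroccs_ours, sroccs_baseline, alpha=0.05):
    """Two-sided t-test; the null hypothesis is equal SROCC at the 95% confidence level."""
    t_stat, p_value = stats.ttest_ind(sroccs_ours, sroccs_baseline)
    return p_value < alpha, t_stat, p_value
```

Since every model would be evaluated on the same 15 splits, a paired test (scipy.stats.ttest_rel) would be an equally defensible choice; the paper does not say which variant was used.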
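
The Experiment Setup row describes a staged schedule: restorator pretraining, joint adversarial training with five discriminator updates per restorator update, then evaluator training with the other two networks frozen. The TensorFlow sketch below only illustrates that schedule under stated assumptions: the tiny stand-in networks, the make_batch data source, the losses and the batch size are hypothetical and do not reproduce the RAN architecture (Figure 1 of the paper) or its quality-score formulation.

```python
# Minimal sketch of the staged training schedule (assumed details noted above).
import tensorflow as tf

PATCH = 64  # images are cropped into 64x64 non-overlapping patches

def tiny_scorer():
    """Stand-in network producing one scalar per patch; not the paper's architecture."""
    return tf.keras.Sequential([
        tf.keras.layers.Conv2D(16, 3, padding="same", activation="relu"),
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(1),
    ])

restorator = tf.keras.Sequential([tf.keras.layers.Conv2D(3, 3, padding="same")])
discriminator = tiny_scorer()  # real/restored classifier (logit output)
evaluator = tiny_scorer()      # quality-score regressor

# Build the variables once so trainable_variables is populated before the first update.
dummy = tf.zeros([1, PATCH, PATCH, 3])
restorator(dummy); discriminator(dummy); evaluator(dummy)

def make_batch(batch_size=8):
    """Hypothetical data source yielding (distorted patch, pristine patch, quality score)."""
    return (tf.random.uniform([batch_size, PATCH, PATCH, 3]),
            tf.random.uniform([batch_size, PATCH, PATCH, 3]),
            tf.random.uniform([batch_size, 1]))

def grad_step(opt, loss_fn, variables):
    """One optimizer update of the given variables on the loss returned by loss_fn."""
    with tf.GradientTape() as tape:
        loss = loss_fn()
    opt.apply_gradients(zip(tape.gradient(loss, variables), variables))

def pretrain_restorator(iters, lr=1e-4):
    """Stage 1: train the restorator alone toward the pristine patches (Adam, 1e-4)."""
    opt = tf.keras.optimizers.Adam(lr)
    for _ in range(iters):
        distorted, pristine, _ = make_batch()
        grad_step(opt,
                  lambda: tf.reduce_mean(tf.square(restorator(distorted) - pristine)),
                  restorator.trainable_variables)

def adversarial_train(iters, lr, d_steps=5):
    """Stage 2: joint restorator/discriminator training (RMSProp); 5 D updates per R update."""
    d_opt, r_opt = tf.keras.optimizers.RMSprop(lr), tf.keras.optimizers.RMSprop(lr)
    bce = tf.keras.losses.BinaryCrossentropy(from_logits=True)
    for _ in range(iters):
        for _ in range(d_steps):
            distorted, pristine, _ = make_batch()
            def d_loss():
                real, fake = discriminator(pristine), discriminator(restorator(distorted))
                return bce(tf.ones_like(real), real) + bce(tf.zeros_like(fake), fake)
            grad_step(d_opt, d_loss, discriminator.trainable_variables)
        distorted, _, _ = make_batch()
        def r_loss():
            fake = discriminator(restorator(distorted))
            return bce(tf.ones_like(fake), fake)
        grad_step(r_opt, r_loss, restorator.trainable_variables)

def train_evaluator(iters, lr=1e-4):
    """Stages 3-4: restorator and discriminator frozen; only the evaluator is updated (Adam)."""
    opt = tf.keras.optimizers.Adam(lr)
    for _ in range(iters):
        distorted, _, score = make_batch()
        grad_step(opt,
                  lambda: tf.reduce_mean(tf.square(evaluator(distorted) - score)),
                  evaluator.trainable_variables)

# Schedule reported in the paper (iteration counts shrunk here so the sketch runs quickly):
pretrain_restorator(iters=3)          # paper: 300,000 iterations at 1e-4
adversarial_train(iters=3, lr=1e-4)   # paper: 300,000 iterations at 1e-4
adversarial_train(iters=3, lr=1e-5)   # paper: another 300,000 iterations at 1e-5
train_evaluator(iters=3)              # paper: evaluator pretraining, 300,000 iterations at 1e-4
train_evaluator(iters=3)              # finetune: 20,000 iterations on TID2013 / 15,000 on LIVE
```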