Exploring CLIP for Assessing the Look and Feel of Images

Authors: Jianyi Wang, Kelvin C.K. Chan, Chen Change Loy

AAAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We also provide extensive experiments on controlled datasets and Image Quality Assessment (IQA) benchmarks. Our results show that CLIP captures meaningful priors that generalize well to different perceptual assessments. We conduct experiments on three widely used NR-IQA benchmarks including LIVE-itW (Ghadiyaram and Bovik 2015) and KonIQ-10k (Hosu et al. 2020) for realistic camera distortions and SPAQ (Fang et al. 2020) for smartphone photography."
Researcher Affiliation | Academia | "Jianyi Wang, Kelvin C.K. Chan, Chen Change Loy* S-Lab, Nanyang Technological University {jianyi001, chan0899, ccloy}@ntu.edu.sg"
Pseudocode | No | The paper does not include any pseudocode or clearly labeled algorithm blocks.
Open Source Code | No | The paper does not provide any explicit statement or link indicating that the source code for CLIP-IQA or its methodology is open-source or publicly available.
Open Datasets | Yes | "We conduct experiments on three widely used NR-IQA benchmarks including LIVE-itW (Ghadiyaram and Bovik 2015) and KonIQ-10k (Hosu et al. 2020) for realistic camera distortions and SPAQ (Fang et al. 2020) for smartphone photography. We conduct our experiments on the AVA dataset (Murray, Marchesotti, and Perronnin 2012) since it is among the largest visual perception datasets, containing over 250,000 images with a broad variety of content."
Dataset Splits | No | The paper mentions training on KonIQ-10k and testing on KonIQ-10k, LIVE-itW, and SPAQ, but does not provide specific details on training/validation/test splits (e.g., percentages or sample counts) for any dataset. It refers to the 'test set of KonIQ-10k' for synthetic data but not a validation split.
Hardware Specification | No | The paper does not provide any specific details about the hardware used for running the experiments (e.g., GPU models, CPU types, or memory).
Software Dependencies | No | The paper mentions the Python package PIL and refers to 'ResNet-50-based CLIP' and 'CoOp', but does not provide specific version numbers for these or any other software dependencies.
Experiment Setup | Yes | "We adopt SGD with a learning rate of 0.002 during training. The model is trained for 100000 iterations with batch size 64 on KonIQ-10k dataset and the MSE loss is used to measure the distance between the predictions and the labels."
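The Experiment Setup row pins down the optimizer, learning rate, iteration count, batch size, and loss, but not the surrounding training code. Below is a minimal PyTorch-style sketch of a run matching those quoted hyperparameters; `model` and `train_set` are placeholders, since the excerpt does not specify the model internals (the mention of CoOp above suggests prompt tuning) or the data pipeline.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, Dataset


def finetune_iqa(model: nn.Module, train_set: Dataset, max_iters: int = 100_000) -> nn.Module:
    """Sketch of the quoted setup: SGD (lr 0.002), batch size 64, MSE loss, 100000 iterations.

    Assumptions: `model` maps a batch of images to scalar quality predictions, and
    `train_set` yields (image, mean_opinion_score) pairs, e.g. from KonIQ-10k.
    """
    loader = DataLoader(train_set, batch_size=64, shuffle=True, drop_last=True)  # quoted: batch size 64
    optimizer = torch.optim.SGD(model.parameters(), lr=0.002)                    # quoted: SGD, lr = 0.002
    criterion = nn.MSELoss()                                                     # quoted: MSE loss

    step = 0
    while step < max_iters:
        for images, labels in loader:
            preds = model(images).squeeze(-1)
            loss = criterion(preds, labels.float())
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            step += 1
            if step >= max_iters:
                break
    return model
```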
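For context, the core idea of CLIP-IQA is to compare an image embedding against an antonym prompt pair (e.g. "Good photo." vs "Bad photo.") and take a softmax over the cosine similarities as the quality score. The sketch below illustrates that idea with OpenAI's `clip` package; the RN50 backbone matches the "ResNet-50-based CLIP" mentioned above, but the prompt wording, preprocessing, and fixed similarity scale are our simplifications rather than the authors' implementation (which, per the Open Source Code row, is not linked in the paper).

```python
import torch
import clip  # assumption: OpenAI's CLIP package (https://github.com/openai/CLIP)
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("RN50", device=device)  # ResNet-50-based CLIP backbone

# Antonym prompt pair; the image is scored by how much closer it sits to the positive prompt.
prompts = clip.tokenize(["Good photo.", "Bad photo."]).to(device)


@torch.no_grad()
def quality_score(image_path: str) -> float:
    """Return a score in (0, 1): softmax probability of the positive prompt over the antonym pair."""
    image = preprocess(Image.open(image_path).convert("RGB")).unsqueeze(0).to(device)

    image_feat = model.encode_image(image)
    text_feat = model.encode_text(prompts)

    # Cosine similarities between the image and the two prompts.
    image_feat = image_feat / image_feat.norm(dim=-1, keepdim=True)
    text_feat = text_feat / text_feat.norm(dim=-1, keepdim=True)
    logits = 100.0 * image_feat @ text_feat.t()  # fixed scale; a simplification
    return logits.softmax(dim=-1)[0, 0].item()   # index 0 = "Good photo."
```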