On Efficient Transformer-Based Image Pre-training for Low-Level Vision

Authors: Wenbo Li, Xin Lu, Shengju Qian, Jiangbo Lu

IJCAI 2023

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "To comprehensively diagnose the influence of pre-training, we design a whole set of principled evaluation tools that uncover its effects on internal representations." and "Based on the study, we successfully develop state-of-the-art models for multiple low-level tasks." |
| Researcher Affiliation | Collaboration | Wenbo Li¹, Xin Lu²*, Shengju Qian¹ and Jiangbo Lu³ (¹The Chinese University of Hong Kong, ²Deeproute.ai, ³SmartMore Corporation) |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing source code or a link to a code repository. |
| Open Datasets | Yes | "Following [Chen et al., 2021], we adopt the ImageNet [Deng et al., 2009] dataset in the pre-training stage." |
| Dataset Splits | No | The paper mentions using a "test dataset" for CKA computation (see the CKA sketch after the table) and states that "fine-tuning is performed on a single task", but it does not explicitly detail training, validation, or test splits in the main text. |
| Hardware Specification | No | The paper does not explicitly describe the hardware (specific device models or types) used to run its experiments. |
| Software Dependencies | No | The paper does not provide specific version numbers for key software components or libraries used in the experiments. |
| Experiment Setup | Yes | "We uniformly set the block number in each transformer stage to 6, the expansion ratio of the feed-forward network (FFN) to 2 and the window size to (6, 24)." and, from the Table 3 footnote, the baseline "training patch size is 64×64 (ours is 48×48)"; see the configuration sketch below the table. |
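
For reference, the hyperparameters quoted in the Experiment Setup row can be collected into a single configuration object. The sketch below is a minimal illustration, assuming a window-attention transformer as described in the paper; the class and field names are hypothetical, since the authors release no code.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class StageConfig:
    """Hyperparameters quoted in the Experiment Setup row.

    Field names are hypothetical; the paper publishes no code.
    """
    num_blocks: int = 6                     # blocks per transformer stage
    ffn_expansion_ratio: int = 2            # FFN hidden dim = ratio * embed dim
    window_size: Tuple[int, int] = (6, 24)  # rectangular attention window
    train_patch_size: int = 48              # 48x48 crops (vs. 64x64 in [Chen et al., 2021])


config = StageConfig()
print(config)
```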
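
The Dataset Splits row refers to CKA (centered kernel alignment), which the paper uses to compare internal representations on a test dataset. As the paper publishes no code, the following is a minimal NumPy sketch of linear CKA (after Kornblith et al., 2019); the function name and interface are our own, not the authors' implementation.

```python
import numpy as np


def linear_cka(x: np.ndarray, y: np.ndarray) -> float:
    """Linear CKA between two feature matrices of shape (n_samples, dim).

    Values lie in [0, 1]; higher means more similar representations.
    """
    # Center each feature dimension across the sample axis.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)
    # CKA(X, Y) = ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F)
    return float(
        np.linalg.norm(y.T @ x) ** 2
        / (np.linalg.norm(x.T @ x) * np.linalg.norm(y.T @ y))
    )


# Example: compare same-layer features of two models on identical inputs.
rng = np.random.default_rng(0)
feats_a = rng.standard_normal((256, 96))   # e.g., stage-1 features, model A
feats_b = rng.standard_normal((256, 128))  # e.g., stage-1 features, model B
print(linear_cka(feats_a, feats_b))
```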