On Efficient Transformer-Based Image Pre-training for Low-Level Vision
Authors: Wenbo Li, Xin Lu, Shengju Qian, Jiangbo Lu
IJCAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To comprehensively diagnose the influence of pre-training, we design a whole set of principled evaluation tools that uncover its effects on internal representations. Based on the study, we successfully develop state-of-the-art models for multiple low-level tasks. |
| Researcher Affiliation | Collaboration | Wenbo Li (The Chinese University of Hong Kong), Xin Lu (Deeproute.ai), Shengju Qian (The Chinese University of Hong Kong), Jiangbo Lu (SmartMore Corporation) |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing source code or a link to a code repository. |
| Open Datasets | Yes | Following [Chen et al., 2021], we adopt the ImageNet [Deng et al., 2009] dataset in the pre-training stage. |
| Dataset Splits | No | The paper mentions using a 'test dataset' for CKA computation and 'fine-tuning is performed on a single task', but does not explicitly provide details about training, validation, or test dataset splits in the main text. |
| Hardware Specification | No | The paper does not explicitly describe the hardware used for running its experiments with specific models or types. |
| Software Dependencies | No | The paper does not provide specific version numbers for key software components or libraries used in the experiments. |
| Experiment Setup | Yes | We uniformly set the block number in each transformer stage to 6, the expansion ratio of the feed-forward network (FFN) to 2, and the window size to (6, 24). The training patch size is 64x64 (ours is 48x48) (from the Table 3 footnote). |
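
The experiment-setup row above lists concrete architecture and training hyperparameters. The following is a minimal sketch of how those quoted values could be collected into configuration objects; the class and field names (`StageConfig`, `FineTuneConfig`, `num_blocks`, etc.) are hypothetical illustrations, not identifiers from the paper or any released code.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class StageConfig:
    """Hypothetical per-stage transformer settings mirroring the quoted setup."""
    num_blocks: int = 6                     # blocks per transformer stage
    ffn_expansion_ratio: int = 2            # FFN hidden dim = 2 x embedding dim
    window_size: Tuple[int, int] = (6, 24)  # anisotropic attention window (h, w)


@dataclass
class FineTuneConfig:
    """Hypothetical fine-tuning settings; 48 is the paper's crop size, 64 the compared method's."""
    train_patch_size: int = 48


if __name__ == "__main__":
    # Quick sanity check: print the assumed configuration.
    print(StageConfig())
    print(FineTuneConfig())
```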