Progressively Compressed Auto-Encoder for Self-supervised Representation Learning

Authors: Jin Li, Yaoming Wang, Xiaopeng Zhang, Yabo Chen, Dongsheng Jiang, Wenrui Dai, Chenglin Li, Hongkai Xiong, Qi Tian

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experiments show that PCAE achieves comparable performance to MAE with only 1/8 GPU days."; "Experiments over multiple benchmarks demonstrate the effectiveness of the proposed method."; "This section is organized as follows: The comparisons between PCAE and related methods over multiple benchmarks including classification and object detection are delivered in Section 4.1. Section 4.2 exhibits the acceleration performance of PCAE in downstream tasks. Section 4.3 presents ablations on hyper-parameters of PCAE." (Section 4, Experiments)
Researcher Affiliation | Collaboration | Jin Li^1, Yaoming Wang^1, Xiaopeng Zhang^2, Yabo Chen^1, Dongsheng Jiang^2, Wenrui Dai^1, Chenglin Li^1, Hongkai Xiong^1, Qi Tian^2; ^1 Shanghai Jiao Tong University, ^2 Huawei Cloud; {deserve_lj, wang_yaoming, chenyabo, daiwenmao, lcl1985, xionghongkai}@sjtu.edu.cn; zxphistory@gmail.com; dongsheng_jiang@outlook.com; tian.qi1@huawei.com
Pseudocode | No | The paper does not include any clearly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code | Yes | The code is available at https://github.com/caddyless/PCAE/
Open Datasets | Yes | "We run randomly initialized ViT-B networks on ImageNet validation set"; "Fine-tuning on ImageNet-1k. We follow the standard MIM evaluation protocol that fine-tunes the pre-trained model on the ILSVRC-2012 ImageNet for 100 epochs."; object detection and instance segmentation on COCO
Dataset Splits | Yes | "We run randomly initialized ViT-B networks on ImageNet validation set"; "Fine-tuning on ImageNet-1k. We follow the standard MIM evaluation protocol that fine-tunes the pre-trained model on the ILSVRC-2012 ImageNet for 100 epochs."
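For readers re-implementing the split usage cited above, here is a minimal, hedged sketch assuming the standard torchvision layout with placeholder paths and transforms (it is not the authors' data pipeline): the ~1.28M-image ILSVRC-2012 train split is used for pre-training and the 100-epoch fine-tuning, and the 50k-image validation split is used for evaluation.

```python
# Hedged sketch of the standard ILSVRC-2012 (ImageNet-1k) splits referenced above.
# Paths and transform choices are placeholders, not taken from the paper or its code.
from torchvision import datasets, transforms

train_tf = transforms.Compose([
    transforms.RandomResizedCrop(224),   # MAE-style light augmentation: random crop + flip
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])
val_tf = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
])

# ~1.28M images: used for pre-training and fine-tuning.
train_set = datasets.ImageFolder("/path/to/imagenet/train", transform=train_tf)
# 50k images: used for evaluation.
val_set = datasets.ImageFolder("/path/to/imagenet/val", transform=val_tf)
```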
Hardware Specification | Yes | "PCAE is able to accelerate training 2.25 times compared with MAE (He et al., 2022) (739.7 img/s vs. 328.4 img/s, ViT-Base, 32 GB V100)"; "We report the throughput of PCAE on V100 32 GB in Table 4."
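For context on how images-per-second figures like those quoted above can be measured, a minimal sketch follows. It is illustrative only: the model, batch size, and iteration counts are assumptions, the timing is forward-only (a faithful training-throughput number would also include the backward pass and optimizer step), and nothing here comes from the authors' benchmarking code.

```python
# Minimal, forward-only throughput sketch (img/s). Illustrative only: the paper's
# 739.7 vs. 328.4 img/s figures refer to training throughput on a 32 GB V100.
import time
import torch

def images_per_second(model, batch_size=128, image_size=224,
                      warmup=10, steps=30, device="cuda"):
    model = model.to(device).eval()
    x = torch.randn(batch_size, 3, image_size, image_size, device=device)
    with torch.no_grad():
        for _ in range(warmup):        # warm-up to exclude CUDA init/caching effects
            model(x)
        torch.cuda.synchronize()
        start = time.time()
        for _ in range(steps):
            model(x)
        torch.cuda.synchronize()       # wait for queued kernels before stopping the clock
    return steps * batch_size / (time.time() - start)
```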
Software Dependencies | No | The paper refers to software components (e.g., deep learning frameworks) only implicitly, through context and citations, and does not give version numbers for any software dependency.
Experiment Setup | Yes | "The settings of data augmentation and optimization keep consistent with MAE (He et al., 2022). The mask ratio of 75% is applied on the input image across all experiments, and PCAE discards 50% of tokens after the 0th, 4th, and 8th layers by default. [...] Specifically, we pre-train PCAE on ImageNet for 100 epochs with a batch size of 1024 and fine-tune the pre-trained model on ImageNet for 100 epochs with the default schedule of MAE for a fair comparison."
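To make the stated hyper-parameters concrete, the sketch below (a hedged PyTorch illustration, not the released PCAE code) starts from the 49 visible tokens left after 75% masking of a ViT-B 14x14 patch grid and halves the token count after blocks 0, 4, and 8, matching the drop schedule quoted above. The random keep rule and the layer dimensions are placeholder assumptions standing in for whatever selection criterion and block configuration the method actually uses.

```python
# Hedged sketch of progressive token compression: keep 25% of patch tokens at the
# input (75% masking, as in MAE) and drop a further 50% after blocks 0, 4, and 8.
import torch
import torch.nn as nn

class ProgressivelyCompressedEncoder(nn.Module):
    """Transformer encoder that halves its token count after selected blocks."""

    def __init__(self, dim=768, depth=12, heads=12, drop_after=(0, 4, 8), keep_ratio=0.5):
        super().__init__()
        self.blocks = nn.ModuleList([
            nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
            for _ in range(depth)
        ])
        self.drop_after = set(drop_after)
        self.keep_ratio = keep_ratio

    def forward(self, tokens):  # tokens: (B, N, dim), already 75%-masked as in MAE
        for i, block in enumerate(self.blocks):
            tokens = block(tokens)
            if i in self.drop_after:
                # Placeholder selection rule: keep a random 50% subset of tokens.
                # The actual PCAE selection criterion may differ.
                b, n, d = tokens.shape
                keep = max(1, int(n * self.keep_ratio))
                idx = torch.rand(b, n, device=tokens.device).argsort(dim=1)[:, :keep]
                tokens = tokens.gather(1, idx.unsqueeze(-1).expand(-1, -1, d))
        return tokens

# ViT-B geometry: 196 patches at 224x224 with 16x16 patches, 75% masked -> 49 visible.
visible = torch.randn(2, 49, 768)
out = ProgressivelyCompressedEncoder()(visible)
print(out.shape)  # torch.Size([2, 6, 768]) after three successive 50% drops
```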