Progressively Compressed Auto-Encoder for Self-supervised Representation Learning
Authors: Jin Li, Yaoming Wang, Xiaopeng Zhang, Yabo Chen, Dongsheng Jiang, Wenrui Dai, Chenglin Li, Hongkai Xiong, Qi Tian
ICLR 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that PCAE achieves comparable performance to MAE with only 1/8 GPU days. Experiments over multiple benchmarks demonstrate the effectiveness of the proposed method. This section is organized as follows: the comparisons between PCAE and related methods over multiple benchmarks, including classification and object detection, are delivered in Section 4.1. Section 4.2 exhibits the acceleration performance of PCAE in downstream tasks. Section 4.3 presents ablations on hyper-parameters of PCAE. (Section 4, EXPERIMENTS) |
| Researcher Affiliation | Collaboration | Jin Li1, Yaoming Wang1, Xiaopeng Zhang2, Yabo Chen1, Dongsheng Jiang2, Wenrui Dai1, Chenglin Li1, Hongkai Xiong1, Qi Tian2; 1Shanghai Jiao Tong University, 2Huawei Cloud; {deserve_lj, wang_yaoming, chenyabo, daiwenmao, lcl1985, xionghongkai}@sjtu.edu.cn; zxphistory@gmail.com, dongsheng_jiang@outlook.com, tian.qi1@huawei.com |
| Pseudocode | No | The paper does not include any clearly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | Yes | The code is available at https://github.com/caddyless/PCAE/ |
| Open Datasets | Yes | We run randomly initialized ViT-B networks on the ImageNet validation set; fine-tuning on ImageNet-1k. We follow the standard MIM evaluation protocol that fine-tunes the pre-trained model on the ILSVRC-2012 ImageNet for 100 epochs; object detection and instance segmentation on COCO. |
| Dataset Splits | Yes | We run randomly initialized ViT-B networks on the ImageNet validation set and fine-tune on ImageNet-1k. We follow the standard MIM evaluation protocol that fine-tunes the pre-trained model on the ILSVRC-2012 ImageNet for 100 epochs. |
| Hardware Specification | Yes | PCAE is able to accelerate training 2.25 times compared with MAE (He et al., 2022) (739.7 img/s vs. 328.4 img/s, ViT-Base, 32 GB V100). We report the throughput of PCAE on a V100 32 GB in Table 4. |
| Software Dependencies | No | The paper mentions software components implicitly through context and citations (e.g., in relation to deep learning frameworks), but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | The settings of data augmentation and optimization keep consistent with MAE (He et al., 2022). The mask ratio of 75% is applied on the input image across all experiments, and PCAE discards 50% tokens after 0th, 4th, and 8th layer respectively in default. [...] Specifically, we pre-train PCAE on Image Net for 100 epochs with batch size of 1024 and fine-tune the pre-trained model on Image Net for 100 epochs with the default schedule of MAE for a fair comparison. |
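
To make the token schedule quoted in the Experiment Setup row concrete, the following is a minimal sketch (not the authors' released code) of the progressive compression it describes: a 75% input mask ratio, with 50% of the remaining tokens discarded after the 0th, 4th, and 8th encoder layers. The patch count, 0-indexed layer numbering, and function name are illustrative assumptions; the repository at https://github.com/caddyless/PCAE/ is the authoritative implementation.

```python
# Sketch of the PCAE token-count schedule described above (assumptions noted
# in the lead-in): 75% of patches are masked at the input, and 50% of the
# remaining tokens are discarded after encoder layers 0, 4, and 8.

MASK_RATIO = 0.75
DROP_LAYERS = {0, 4, 8}  # drop points (assumed 0-indexed encoder layers)
DROP_RATIO = 0.5         # fraction of surviving tokens discarded at each drop point


def token_counts(num_patches: int = 196, num_layers: int = 12) -> list[int]:
    """Return the number of tokens processed by each encoder layer."""
    kept = int(num_patches * (1 - MASK_RATIO))  # visible tokens after MAE-style masking
    counts = []
    for layer in range(num_layers):
        counts.append(kept)                      # tokens entering this layer
        if layer in DROP_LAYERS:
            kept = int(kept * (1 - DROP_RATIO))  # progressive compression
    return counts


if __name__ == "__main__":
    # 224x224 image with 16x16 patches -> 196 patches; 49 visible tokens enter layer 0.
    print(token_counts())  # [49, 24, 24, 24, 24, 12, 12, 12, 12, 6, 6, 6]
```

Under these assumptions, most encoder layers operate on far fewer tokens than a standard MAE encoder, which is consistent with the reported 2.25x throughput gain over MAE.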