MCMAE: Masked Convolution Meets Masked Autoencoders
Authors: Peng Gao, Teli Ma, Hongsheng Li, Ziyi Lin, Jifeng Dai, Yu Qiao
NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 3 (Experiments): To validate our proposed MCMAE, we conduct experiments of image classification on the ImageNet-1K [14] dataset. The pretrained MCMAE is also extensively tested on object detection and semantic segmentation. |
| Researcher Affiliation | Collaboration | Peng Gao (1), Teli Ma (1), Hongsheng Li (1,2), Ziyi Lin (2), Jifeng Dai (3), Yu Qiao (1); (1) Shanghai AI Laboratory, Shanghai, China; (2) MMLab, CUHK; (3) SenseTime Research |
| Pseudocode | No | Not found. The paper describes the architecture and processes in text and figures but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code and pretrained models are available at https://github.com/Alpha-VL/ConvMAE. |
| Open Datasets | Yes | Section 3 (Experiments): To validate our proposed MCMAE, we conduct experiments of image classification on the ImageNet-1K [14] dataset. ... COCO dataset [39] has been widely adopted for benchmarking object detection frameworks. ... ADE20K [64] is a widely-used semantic segmentation dataset... To validate the video understanding ability of Video MCMAE, we pretrain on Kinetics-400 (K400) [32] and Something-Something V2 (SSV2) [23] independently... |
| Dataset Splits | Yes | ImageNet-1K [14] consists of 1.3M images of 1k categories for image classification and is split into training and validation sets. ... The dataset is split into training, validation, and testing sets. ... We report the classification accuracy on the ImageNet validation set of the finetuned and pretrained (linear probe) MCMAE encoders. ... We finetune Mask R-CNN on the COCO train2017 split and report AP^box and AP^mask on the val2017 split. |
| Hardware Specification | No | Not found. The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | Not found. The paper mentions the AdamW optimizer but does not list specific software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x). |
| Experiment Setup | Yes | Experimental Setup. ... We adopt a 1600-epoch cosine learning rate schedule with the first 40 epochs for warming up. The AdamW optimizer is utilized with a base learning rate of 1.5e-4, a weight decay of 0.05 and a batch size of 1024. Random cropping is employed as data augmentation during pretraining. After pretraining, the MCMAE encoder is used for supervised finetuning on the ImageNet-1K training set for 100 epochs using the cosine learning rate schedule. We follow the default finetuning parameters of the original MAE [28] except for the layer-wise learning-rate decay parameters (0.65, 0.75, 0.85). |
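
For reference, the quoted pretraining hyperparameters can be expressed as the minimal sketch below. This is not the authors' released code (that lives in the repository linked above); it assumes a PyTorch setup, and the stand-in model, the MAE-style linear learning-rate scaling by batch size / 256, and the AdamW betas of (0.9, 0.95) are assumptions not stated in the quoted text.

```python
import math
import torch

# Hyperparameters quoted in the paper's experimental setup.
base_lr = 1.5e-4
batch_size = 1024
weight_decay = 0.05
total_epochs = 1600
warmup_epochs = 40

# Stand-in module; the actual MCMAE encoder-decoder is in the authors' repository.
model = torch.nn.Linear(8, 8)

# Assumption: MAE-style linear scaling of the base learning rate with batch size.
peak_lr = base_lr * batch_size / 256

optimizer = torch.optim.AdamW(
    model.parameters(),
    lr=peak_lr,
    betas=(0.9, 0.95),        # assumption: MAE's default betas, not stated in the paper
    weight_decay=weight_decay,
)

def lr_at_epoch(epoch: int) -> float:
    """Cosine learning-rate schedule with a 40-epoch linear warmup, per the quoted setup."""
    if epoch < warmup_epochs:
        return peak_lr * epoch / warmup_epochs
    progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))

def layerwise_lr_scale(layer_id: int, num_layers: int, decay: float = 0.65) -> float:
    """Finetuning-time per-layer lr multiplier: earlier layers get smaller learning rates.
    The quoted setup uses decay values of 0.65, 0.75 and 0.85."""
    return decay ** (num_layers - layer_id)
```

In practice the schedule would be applied per step or per epoch by writing `lr_at_epoch(epoch)` into each optimizer parameter group; the sketch only mirrors the quoted hyperparameters, not the full training loop.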