Towards Understanding Why Mask Reconstruction Pretraining Helps in Downstream Tasks
Authors: Jiachun Pan, Pan Zhou, Shuicheng Yan
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results testify to our data assumptions and also our theoretical implications. (Abstract; Introduction); Section 5: Experiments (Section title) |
| Researcher Affiliation | Collaboration | Jiachun Pan¹,², Pan Zhou¹, Shuicheng Yan¹; ¹Sea AI Lab, ²National University of Singapore; pan.jiachun@u.nus.edu, {zhoupan,yansc}@sea.com |
| Pseudocode | No | The paper describes mathematical derivations and processes (e.g., in Appendix E.1 'The derivation of above loss function is shown as follows.'), but it does not contain any clearly labeled 'Pseudocode' or 'Algorithm' blocks with structured steps. |
| Open Source Code | No | Footnotes 2, 3, 4 in Appendix B provide links to 'official trained SL model', 'official trained MAE model', and 'official trained data2vec model'. These are models/frameworks from other research groups that the authors used for comparison or as baselines, not the source code for their own proposed methodology. There is no statement from the authors about releasing their own code. |
| Open Datasets | Yes | We pretrain for 300 epochs on ImageNet, and fine-tune pretrained ResNet50 for 100 epochs on ImageNet. (Section 5); on ImageNet (Deng et al., 2009) (Section 5); VOC07+12 (Table 1) |
| Dataset Splits | No | We pretrain for 300 epochs on ImageNet, and fine-tune pretrained ResNet50 for 100 epochs on ImageNet. (Section 5). The paper does not specify how the data was split into training, validation, and test sets, nor does it explicitly mention the use of a validation set for hyperparameter tuning or early stopping. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments (e.g., GPU models, CPU types, memory specifications). |
| Software Dependencies | No | The paper mentions using 'ResNet50 (He et al., 2016b) trained by the PyTorch Team' and 'ViT-base models (Dosovitskiy et al., 2020)', but it does not specify any software dependencies (e.g., libraries, frameworks) with version numbers required to reproduce the experiments. |
| Experiment Setup | No | The paper mentions 'pretrain for 300 epochs on ImageNet, and fine-tune pretrained ResNet50 for 100 epochs on ImageNet.' (Section 5) and describes the masking process (e.g., 'randomly mask input patches', Pr(ϵ_i = 1) = θ; see the sketch below this table). It also states 'the learning rate η1 is often much smaller than η2 in practice.' (Section 3.2). However, concrete values for hyperparameters such as learning rates, batch sizes, or optimizer settings are not provided in the main text. |
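
The masking process quoted above, Pr(ϵ_i = 1) = θ, is the Bernoulli patch masking standard in MAE-style mask reconstruction pretraining. Below is a minimal PyTorch sketch of what that setup description implies; the function name `random_patch_mask`, the i.i.d. masking, and the zero-filling convention are illustrative assumptions, not details taken from the paper.

```python
import torch

def random_patch_mask(patches: torch.Tensor, theta: float):
    """Randomly mask input patches with Pr(eps_i = 1) = theta.

    patches: (batch, num_patches, dim) patch embeddings.
    Returns (masked_patches, mask), where mask[b, i] = 1 means patch i was masked.
    """
    batch, num_patches, _ = patches.shape
    # Draw eps_i ~ Bernoulli(theta) independently per patch (assumption: i.i.d.
    # masking, as is standard in MAE-style pretraining).
    mask = torch.bernoulli(torch.full((batch, num_patches), theta))
    # Zero out masked patches; real implementations often substitute a learned
    # [MASK] token or drop the masked patches from the encoder input entirely.
    masked = patches * (1.0 - mask).unsqueeze(-1)
    return masked, mask

# Usage: mask 75% of 196 patches, a ratio commonly used by MAE (illustrative,
# not a value reported in this paper).
x = torch.randn(2, 196, 768)
masked_x, mask = random_patch_mask(x, theta=0.75)
```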