Prototypical Contrastive Predictive Coding
Authors: Kyungmin Lee
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through extensive experiments, we validate the effectiveness of our method compared to various supervised and self-supervised knowledge distillation baselines. |
| Researcher Affiliation | Academia | Kyungmin Lee, Agency for Defense Development, kyungmnlee@gmail.com |
| Pseudocode | Yes | The PyTorch style pseudo-code for our ProtoCPC is demonstrated in Algorithm 1. (A hedged sketch of such a loss follows the table.) |
| Open Source Code | No | The paper does not provide an explicit statement or link for the open-source code of the described methodology. |
| Open Datasets | Yes | We experiment on CIFAR-100 (Krizhevsky et al., 2009) and ImageNet (Deng et al., 2009). |
| Dataset Splits | Yes | Table 3: Top-1 and Top-5 error rates (%) of student network ResNet-18 on ImageNet validation set. |
| Hardware Specification | No | The paper does not specify the hardware (e.g., GPU models, CPU models, or cloud computing instances) used for running the experiments. |
| Software Dependencies | No | The paper mentions 'PyTorch style pseudocode' but does not specify version numbers for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | For CIFAR-100, we initialize the learning rate as 0.05, and decay it by 0.1 every 30 epochs after the first 150 epochs until the last 240 epochs. [...] Batch size is 64 for CIFAR-100 or 256 for ImageNet. [...] For probability of teacher, we use SK operator with 3 steps of iteration and τ_t = 0.04. For probability of student, we set τ_s = 0.1. The prior momentum for ProtoCPC loss is 0.9. We use SGD optimizer with batch size 512 and weight decay is 1e-4. The learning rate is 0.6 and is decayed by cosine learning rate schedule to 1e-6. (Hedged sketches of the loss and the optimizer setup follow the table.) |
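
The quoted setup gives enough detail for a rough illustration of the distillation loss. Below is a minimal, hypothetical sketch assuming a SwAV/DINO-style formulation in which teacher logits over prototypes are turned into balanced assignments by the Sinkhorn-Knopp (SK) operator (3 iterations, τ_t = 0.04) and the student matches them with a temperature-scaled softmax (τ_s = 0.1) under a cross-entropy loss. The momentum-updated prior (0.9) mentioned in the setup is not modeled here, and all function and variable names are placeholders rather than the author's actual code.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def sinkhorn_knopp(teacher_logits: torch.Tensor, tau_t: float = 0.04,
                   n_iters: int = 3) -> torch.Tensor:
    """Balanced prototype assignments for a batch of teacher logits (SK operator)."""
    q = torch.exp(teacher_logits / tau_t).t()  # (n_prototypes, batch)
    q /= q.sum()
    n_prototypes, batch_size = q.shape
    for _ in range(n_iters):
        q /= q.sum(dim=1, keepdim=True)        # normalize each prototype row over the batch
        q /= n_prototypes
        q /= q.sum(dim=0, keepdim=True)        # normalize each sample column over the prototypes
        q /= batch_size
    return (q * batch_size).t()                # rows are per-sample prototype distributions

def protocpc_style_loss(student_logits: torch.Tensor,
                        teacher_logits: torch.Tensor,
                        tau_s: float = 0.1) -> torch.Tensor:
    """Cross-entropy between SK teacher assignments and the student's softmax."""
    targets = sinkhorn_knopp(teacher_logits)                   # (batch, n_prototypes)
    log_p_student = F.log_softmax(student_logits / tau_s, dim=-1)
    return -(targets * log_p_student).sum(dim=-1).mean()

# Example usage with random logits over 100 prototypes (shapes are illustrative).
student_logits = torch.randn(64, 100, requires_grad=True)
teacher_logits = torch.randn(64, 100)
loss = protocpc_style_loss(student_logits, teacher_logits)
loss.backward()
```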
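
Similarly, the quoted ImageNet optimization settings (SGD, batch size 512, weight decay 1e-4, learning rate 0.6 decayed by a cosine schedule to 1e-6) can be expressed as a short PyTorch configuration sketch. The epoch count, the SGD momentum, and the training-loop details are assumptions, not values from the paper.

```python
import torch

# Stand-ins: the actual student architecture and data pipeline are not part of
# the quoted setup, so a linear layer and a fixed epoch count are used here.
student = torch.nn.Linear(512, 1000)
num_epochs = 100  # assumption; the quoted text does not state the epoch count

optimizer = torch.optim.SGD(
    student.parameters(),
    lr=0.6,             # initial learning rate from the quoted setup
    momentum=0.9,       # assumption; the quoted 0.9 refers to the prior momentum
    weight_decay=1e-4,  # weight decay from the quoted setup
)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
    optimizer, T_max=num_epochs, eta_min=1e-6)  # cosine decay down to 1e-6

for epoch in range(num_epochs):
    # ... iterate over batches of size 512, compute the ProtoCPC-style loss,
    #     then call loss.backward() and optimizer.step() ...
    scheduler.step()
```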