Data-Efficient Image Recognition with Contrastive Predictive Coding

Authors: Olivier Hénaff

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We revisit CPC in terms of its architecture and training methodology, and arrive at a new implementation with a dramatically-improved ability to linearly separate image classes (from 48.7% to 71.5% Top-1 ImageNet classification accuracy, a 23% absolute improvement), setting a new state-of-the-art.
Researcher Affiliation | Collaboration | 1DeepMind, London, UK; 2University of California, Berkeley.
Pseudocode | No | The paper describes the Contrastive Predictive Coding objective with mathematical formulas and architectural components in prose and diagrams, but it does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper does not contain any explicit statements or links indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | In all cases, the dataset of unlabeled images Du we pre-train on is the full ImageNet ILSVRC 2012 training set (Russakovsky et al., 2015). We consider three labeled datasets Dl for evaluation, each with an associated classifier hψ and supervised loss LSup (see Fig. 2, right)... Transfer learning tests the generality of the representation by applying it to a new task and dataset: object detection on the PASCAL VOC 2007 dataset, a standard benchmark in computer vision (Everingham et al., 2007).
Dataset Splits | No | The paper specifies using the 'ImageNet ILSVRC 2012 training set' for pre-training and a 'random subset of the ImageNet dataset: we investigated using 1%, 2%, 5%, 10%, 20%, 50% and 100% of the dataset' for labeled training. It mentions cross-validation for choosing the number of fine-tuning epochs, and a 'custom validation set' in the Figure 3 caption for intermediate results, but it does not give reproducible percentages or absolute counts for a validation split used across the main experiments.
Hardware Specification | No | The paper describes the neural network architectures (e.g., ResNet-161, ResNet-33) and training procedures, but it does not specify any hardware details such as GPU models, CPU types, or cloud computing instances used for running the experiments.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python, TensorFlow, or PyTorch versions, or specific library versions) that would be needed to replicate the experiment environment.
Experiment Setup | Yes | We revisit CPC in terms of its architecture and training methodology... We consider three labeled datasets Dl for evaluation... The overarching principle behind our new model design is to increase the scale and efficiency of the encoder architecture... We identify four axes for model capacity and task setup that could impact the model's performance. The first axis increases model capacity by increasing depth and width... The third axis increases task complexity by making predictions in all four directions, and the fourth does so by performing more extensive patch-based augmentation... After tuning the supervised model for low-data classification (varying network depth, regularization, and optimization parameters) and extensive use of data-augmentation (including the transformations used for CPC pre-training)... During an initial phase we keep the CPC feature extractor fixed and train the ResNet classifier till convergence... We then fine-tune the entire stack hψfθ for the supervised objective, for a small number of epochs (chosen by cross-validation).
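The Pseudocode row notes that the paper states the CPC objective only in formulas. For orientation, here is a minimal NumPy sketch of an InfoNCE-style contrastive loss of the kind CPC optimizes; the array shapes, the temperature parameter, and the use of other in-batch rows as negatives are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def infonce_loss(predictions, targets, temperature=1.0):
    """InfoNCE-style contrastive loss (sketch of a CPC-like objective).

    predictions: (N, D) array of context-based predicted embeddings.
    targets:     (N, D) array of true patch embeddings; row i is the
                 positive for prediction i, all other rows serve as negatives.
    """
    # Similarity of every prediction with every candidate target.
    logits = predictions @ targets.T / temperature          # (N, N)
    # Numerically stable log-softmax over the candidate targets.
    logits -= logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # The positive pair for prediction i sits on the diagonal.
    return -np.mean(np.diag(log_probs))
```

When predictions match their targets far better than the negatives, the loss approaches zero; for unrelated embeddings it sits near log N.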
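The Dataset Splits row quotes labeled fractions of 1% through 100% of ImageNet without an exact sampling procedure. One plausible way to draw such subsets is class-balanced random sampling; the helper below is a hedged sketch of that idea — the function name, seeding, and per-class balancing are assumptions the paper does not confirm.

```python
import random
from collections import defaultdict

def subsample_labels(labels, fraction, seed=0):
    """Draw a class-balanced random subset of labeled example indices.

    Hypothetical helper: the paper only says 'random subset', so the
    per-class balancing and rounding here are illustrative choices.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    subset = []
    for idxs in by_class.values():
        k = max(1, round(len(idxs) * fraction))  # keep >= 1 example per class
        subset.extend(rng.sample(idxs, k))       # sample without replacement
    return sorted(subset)
```

Fixing the seed makes the subset reproducible across runs, which is exactly the detail the row flags as missing from the paper.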
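The Experiment Setup row describes a two-phase protocol: train a classifier on the frozen CPC feature extractor until convergence, then fine-tune the entire stack for a few epochs. The following is a schematic of that protocol only — the tiny linear model, squared loss, learning rates, and step counts are toy stand-ins, not the paper's ResNet architectures or training details.

```python
import numpy as np

rng = np.random.default_rng(0)
W_feat = rng.standard_normal((16, 8)) * 0.1   # stand-in "pre-trained" extractor f_theta
W_clf = np.zeros((8, 3))                      # stand-in classifier head h_psi

x = rng.standard_normal((32, 16))
y = np.eye(3)[rng.integers(0, 3, 32)]         # one-hot targets

def step(train_feat, lr=0.05):
    """One gradient-descent step; updates W_feat only when train_feat is True."""
    global W_feat, W_clf
    feats = x @ W_feat
    err = (feats @ W_clf - y) / len(x)        # residual of the squared loss
    grad_clf = feats.T @ err
    grad_feat = x.T @ (err @ W_clf.T)
    W_clf -= lr * grad_clf
    if train_feat:
        W_feat -= lr * grad_feat

def loss():
    return float(np.mean((x @ W_feat @ W_clf - y) ** 2))

# Phase 1: frozen extractor, train only the classifier head to convergence.
for _ in range(200):
    step(train_feat=False)
phase1 = loss()

# Phase 2: unfreeze and fine-tune the entire stack for a few steps.
for _ in range(20):
    step(train_feat=True, lr=0.01)
phase2 = loss()
```

The two boolean regimes mirror the paper's "keep the CPC feature extractor fixed" and "fine-tune the entire stack" phases; the smaller phase-2 learning rate reflects the common practice of gentle fine-tuning, not a value reported in the paper.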