Learning Efficient Coding of Natural Images with Maximum Manifold Capacity Representations

Authors: Thomas Yerxa, Yilun Kuang, Eero Simoncelli, SueYeon Chung

NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Here, we simplify this measure to a form that facilitates direct optimization, use it to learn Maximum Manifold Capacity Representations (MMCRs), and demonstrate that these are competitive with state-of-the-art results on current self-supervised learning (SSL) recognition benchmarks. Empirical analyses reveal important differences between MMCRs and the representations learned by other SSL frameworks, and suggest a mechanism by which manifold compression gives rise to class separability. Finally, we evaluate a set of SSL methods on a suite of neural predictivity benchmarks, and find MMCRs are highly competitive as models of the primate ventral stream.
Researcher Affiliation | Collaboration | Thomas Yerxa (1), Yilun Kuang (2,3), Eero Simoncelli (1,2,3), SueYeon Chung (1,2). 1: Center for Neural Science, New York University; 2: Center for Computational Neuroscience, Flatiron Institute; 3: Courant Institute of Mathematical Sciences. Contact: tey214@nyu.edu
Pseudocode | Yes | Appendix B: PyTorch-style pseudocode for MMCR (a hedged re-implementation sketch of the loss appears after this table).
Open Source Code | No | The paper references open-source code for baseline methods (e.g., MoCo, Barlow Twins, and BYOL from solo-learn; SwAV and SimCLR from VISSL), but does not provide a link to, or an explicit statement about open-sourcing, its own MMCR implementation.
Open Datasets | Yes | Table 1: Evaluation of learned features on downstream classification tasks... ImageNet (IN)... Food-101, Flowers-102, DTD. Table 4: Top-1 classification accuracies of linear classifiers for representations trained with various datasets... CIFAR-10, CIFAR-100, STL-10, ImageNet-100... Appendix M: ...fine-tuning the representation network with a Faster R-CNN head and C-4 backbone on the VOC07+12 dataset...
Dataset Splits | Yes | Columns 2 and 3 show semi-supervised evaluation on ImageNet (fine-tuning on 1% and 10% of labels). ... We also perform semi-supervised evaluation, where all model parameters are fine-tuned using a small number of labelled examples...
Hardware Specification | Yes | Pre-training on 16 A100 GPUs using 8 views (our most compute-intensive setting) takes approximately 32 hours.
Software Dependencies | No | The paper mentions software components such as the Adam optimizer (46), the LARS optimizer, the SGD optimizer, and the detectron2 library, and implies PyTorch through its pseudocode, but it does not specify version numbers for any of these dependencies.
Experiment Setup | Yes | For ImageNet we used the LARS optimizer with a learning rate of 4.8, linear warmup during the first 10 epochs and cosine decay thereafter, a batch size of 2048, and pre-train for 100 epochs. ... For the smaller CIFAR-10 we used a smaller batch size, many more views (40), and the Adam optimizer with a fixed learning rate. ... All models were trained for 500 epochs using the Adam optimizer (46) with a learning rate of 1e-3 and weight decay of 1e-6. (A sketch of the warmup-plus-cosine schedule follows this table.)
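
For reference, the following is a minimal sketch of the objective that the paper's Appendix B pseudocode describes: maximizing the nuclear norm of the matrix of per-image centroids of L2-normalized view embeddings. The function name, tensor-shape convention, and omission of any optional per-manifold term are assumptions made for illustration; this is not the authors' released code.

```python
import torch
import torch.nn.functional as F

def mmcr_loss(z: torch.Tensor) -> torch.Tensor:
    """Hedged sketch of the MMCR objective (not the authors' exact code).

    z: view embeddings of shape (B, V, D) -- B images, V augmented views,
       D embedding dimensions (shape convention assumed).
    """
    # Project every view embedding onto the unit hypersphere.
    z = F.normalize(z, dim=-1)
    # Centroid of each image's view manifold: shape (B, D).
    centroids = z.mean(dim=1)
    # Maximizing manifold capacity corresponds to maximizing the nuclear norm
    # (sum of singular values) of the centroid matrix; negate for gradient descent.
    return -torch.linalg.matrix_norm(centroids, ord="nuc")

# Toy usage: 16 images, 8 views each, 128-d embeddings.
z = torch.randn(16, 8, 128, requires_grad=True)
loss = mmcr_loss(z)
loss.backward()
```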
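
The ImageNet pre-training recipe quoted in the Experiment Setup row (base learning rate 4.8, linear warmup for 10 of 100 epochs, cosine decay, batch size 2048) can be expressed as a simple schedule. The sketch below is an illustration under those quoted values; the helper name, the step-based formulation, and the use of plain SGD in place of LARS (which is not part of core PyTorch) are assumptions.

```python
import math
import torch

def warmup_cosine_lr(step: int, total_steps: int, warmup_steps: int,
                     base_lr: float = 4.8) -> float:
    """Linear warmup to base_lr, then cosine decay to zero (values from the quoted setup)."""
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))

# Hypothetical wiring: steps derived from ~1.28M ImageNet training images and batch size 2048.
steps_per_epoch = 1_281_167 // 2048
total_steps, warmup_steps = 100 * steps_per_epoch, 10 * steps_per_epoch
model = torch.nn.Linear(2048, 128)  # placeholder for the encoder + projector
optimizer = torch.optim.SGD(model.parameters(), lr=1.0)  # effective lr comes from the lambda
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer, lr_lambda=lambda s: warmup_cosine_lr(s, total_steps, warmup_steps))
```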