Learning Efficient Coding of Natural Images with Maximum Manifold Capacity Representations
Authors: Thomas Yerxa, Yilun Kuang, Eero Simoncelli, SueYeon Chung
NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Here, we simplify this measure to a form that facilitates direct optimization, use it to learn Maximum Manifold Capacity Representations (MMCRs), and demonstrate that these are competitive with state-of-the-art results on current self-supervised learning (SSL) recognition benchmarks. Empirical analyses reveal important differences between MMCRs and the representations learned by other SSL frameworks, and suggest a mechanism by which manifold compression gives rise to class separability. Finally, we evaluate a set of SSL methods on a suite of neural predictivity benchmarks, and find MMCRs are highly competitive as models of the primate ventral stream. |
| Researcher Affiliation | Collaboration | Thomas Yerxa (1), Yilun Kuang (2,3), Eero Simoncelli (1,2,3), SueYeon Chung (1,2); (1) Center for Neural Science, New York University; (2) Center for Computational Neuroscience, Flatiron Institute; (3) Courant Institute of Mathematical Sciences; tey214@nyu.edu |
| Pseudocode | Yes | Appendix B provides PyTorch-style pseudocode for MMCR (a hedged sketch of the objective appears after this table). |
| Open Source Code | No | The paper references open-source code for baseline methods (e.g., MoCo, Barlow Twins, BYOL from solo-learn; SwAV, SimCLR from VISSL), but does not provide a link or explicit statement for the open-sourcing of their own MMCR implementation. |
| Open Datasets | Yes | Table 1: Evaluation of learned features on downstream classification tasks... ImageNet (IN), Food-101, Flowers-102, DTD. Table 4: Top-1 classification accuracies of linear classifiers for representations trained with various datasets... CIFAR-10, CIFAR-100, STL-10, ImageNet-100... Appendix M: ...fine-tuning the representation network with a Faster R-CNN head and C-4 backbone on the VOC07+12 dataset... |
| Dataset Splits | Yes | Columns 2 and 3 show semi-supervised evaluation on ImageNet (fine-tuning on 1% and 10% of labels). ... We also perform semi-supervised evaluation, where all model parameters are fine-tuned using a small number of labelled examples... |
| Hardware Specification | Yes | Pre-training on 16 A100 GPUs using 8 views (our most compute intensive setting) takes approximately 32 hours. |
| Software Dependencies | No | The paper mentions software components like 'Adam optimizer (46)', 'LARS optimizer', 'SGD optimizer', 'detectron2 library', and implies 'PyTorch' through pseudocode, but it does not specify version numbers for these dependencies. |
| Experiment Setup | Yes | For ImageNet we used the LARS optimizer with a learning rate of 4.8, linear warmup during the first 10 epochs and cosine decay thereafter, a batch size of 2048, and pre-trained for 100 epochs. ... For the smaller CIFAR-10 dataset we used a smaller batch size, many more views (40), and the Adam optimizer with a fixed learning rate. ... All models were trained for 500 epochs using the Adam optimizer (46) with a learning rate of 1e-3 and weight decay of 1e-6. (A training-configuration sketch follows the table.) |
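
Because the Pseudocode row above only cites the paper's Appendix B without reproducing it, here is a minimal, hedged sketch of an MMCR-style objective as the paper describes it: view embeddings are projected onto the unit sphere, averaged into per-image centroids, and the nuclear norm of the centroid matrix is maximized. The function name, tensor shapes, and the omission of any per-manifold regularization term are illustrative assumptions; the paper's own Appendix B pseudocode is authoritative.

```python
import torch
import torch.nn.functional as F


def mmcr_loss(view_embeddings: torch.Tensor) -> torch.Tensor:
    """Sketch of an MMCR-style loss (assumed form, not the official code).

    view_embeddings: (N, k, d) encoder outputs for N images, each with
    k augmented views and embedding dimension d.
    """
    z = F.normalize(view_embeddings, dim=-1)   # project each view onto the unit sphere
    centroids = z.mean(dim=1)                  # (N, d) per-image manifold centroids
    # Maximize the nuclear norm (sum of singular values) of the centroid matrix
    # by minimizing its negative.
    return -torch.linalg.matrix_norm(centroids, ord="nuc")


# Example with random embeddings: 256 images, 8 views, 512-dim embeddings.
z = torch.randn(256, 8, 512, requires_grad=True)
loss = mmcr_loss(z)
loss.backward()
```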
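
The Experiment Setup row can likewise be illustrated with a runnable configuration sketch. LARS is not part of `torch.optim`, so plain SGD stands in here purely to make the warmup-plus-cosine schedule executable (the momentum value and warmup start factor are assumptions; a LARS implementation would come from a third-party library such as solo-learn). The CIFAR-10 setting maps directly onto `torch.optim.Adam` with the quoted hyperparameters.

```python
import torch
from torch import nn
from torch.optim.lr_scheduler import CosineAnnealingLR, LinearLR, SequentialLR

model = nn.Linear(512, 128)  # placeholder for the actual encoder + projector

# ImageNet setting quoted above: lr 4.8, 10-epoch linear warmup, cosine decay,
# batch size 2048, 100 epochs. SGD is a stand-in for LARS (see lead-in note).
epochs, warmup_epochs = 100, 10
optimizer = torch.optim.SGD(model.parameters(), lr=4.8, momentum=0.9)
scheduler = SequentialLR(
    optimizer,
    schedulers=[
        LinearLR(optimizer, start_factor=1e-3, total_iters=warmup_epochs),  # linear warmup
        CosineAnnealingLR(optimizer, T_max=epochs - warmup_epochs),         # cosine decay
    ],
    milestones=[warmup_epochs],
)
# scheduler.step() would be called once per epoch in the training loop.

# CIFAR-10 / small-dataset setting quoted above: Adam, fixed lr 1e-3,
# weight decay 1e-6, 500 epochs, 40 views per image.
cifar_optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-6)
```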