Superposition of many models into one
Authors: Brian Cheung, Alexander Terekhov, Yubei Chen, Pulkit Agrawal, Bruno Olshausen
NeurIPS 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In experiments with neural networks, we show that a surprisingly large number of models can be effectively stored within a single parameter instance. We demonstrate the efficacy of our approach of learning via parameter superposition on two separate online image-classification settings: (a) time-varying input data distribution and (b) time-varying output label distribution. |
| Researcher Affiliation | Academia | Brian Cheung (Redwood Center, BAIR, UC Berkeley) bcheung@berkeley.edu; Alex Terekhov (Redwood Center, UC Berkeley) aterekhov@berkeley.edu; Yubei Chen (Redwood Center, BAIR, UC Berkeley) yubeic@berkeley.edu; Pulkit Agrawal (BAIR, UC Berkeley) pulkitag@berkeley.edu; Bruno Olshausen (Redwood Center, BAIR, UC Berkeley) baolshausen@berkeley.edu |
| Pseudocode | No | No. The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | No. The paper does not provide any explicit statements about releasing source code or links to a code repository for the described methodology. |
| Open Datasets | Yes | Permuting MNIST dataset [2], is a variant of the MNIST dataset [7]... rotating-MNIST and rotating-Fashion MNIST that are variants of the original MNIST and Fashion MNIST [21] datasets... The incremental CIFAR (iCIFAR) dataset [16, 22] (see Figure 6a) is a variant of the CIFAR dataset [6]... |
| Dataset Splits | No | No. The paper describes training and testing procedures on various datasets (e.g., Permuting MNIST, Rotating MNIST/Fashion MNIST, iCIFAR) and mentions the use of 'test sets', but it does not explicitly specify training, validation, AND test splits (percentages, sample counts, or references to standard validation splits) for the experiments. |
| Hardware Specification | No | No. The paper does not provide specific details on the hardware (e.g., GPU models, CPU types, memory) used for running the experiments. |
| Software Dependencies | No | No. The paper does not specify any software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions, or other libraries). |
| Experiment Setup | Yes | We trained fully-connected networks with two hidden layers on fifty permuting MNIST tasks presented sequentially. The size of hidden layers was varied from 128 to 2048 units. In our setup, a new task is created after every 1000 mini-batches (steps) of training by permuting the image pixels. To show that our PSP method can be used with state-of-the-art neural networks, we used ResNet-18 to first train on CIFAR-10 dataset for 20K steps. Next, we trained the network for 20K steps on four subsequent and disjoint sets of 10 classes chosen from the CIFAR-100 dataset. |
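
The experiment-setup quote above lends itself to a concrete illustration. Below is a minimal PyTorch sketch of the permuting-MNIST protocol it describes: a two-hidden-layer fully connected network, fifty tasks presented sequentially, and a new pixel permutation introduced every 1000 mini-batches. Since the paper links no code, all names, the hidden size of 256, the batch size, and the learning rate are placeholder assumptions, and the sketch trains a plain network without the paper's PSP mechanism.

```python
import torch
import torch.nn as nn
from torchvision import datasets, transforms


# Two-hidden-layer fully connected network, matching the quoted setup.
# The paper sweeps hidden sizes from 128 to 2048; 256 is an arbitrary choice here.
class MLP(nn.Module):
    def __init__(self, hidden=256, n_classes=10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),
            nn.Linear(28 * 28, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_classes),
        )

    def forward(self, x):
        return self.net(x)


def make_permutation():
    # Each permuting-MNIST task is defined by one fixed random pixel permutation.
    return torch.randperm(28 * 28)


def infinite_batches(loader):
    # Yield mini-batches indefinitely so a task can span more than one epoch.
    while True:
        for batch in loader:
            yield batch


mnist = datasets.MNIST(".", train=True, download=True,
                       transform=transforms.ToTensor())
loader = torch.utils.data.DataLoader(mnist, batch_size=128, shuffle=True)
batches = infinite_batches(loader)

model = MLP()
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

n_tasks = 50           # fifty permuting-MNIST tasks presented sequentially
steps_per_task = 1000  # a new task (permutation) every 1000 mini-batches

for task in range(n_tasks):
    perm = make_permutation()
    for _ in range(steps_per_task):
        images, labels = next(batches)
        x = images.view(images.size(0), -1)[:, perm]  # permute the pixels
        loss = loss_fn(model(x), labels)
        opt.zero_grad()
        loss.backward()
        opt.step()
```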
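
The responses above also reference the paper's parameter superposition (PSP) method without detailing it. As background only: in the binary variant, the input to a shared linear layer is multiplied by a fixed task-specific random sign vector, so each task effectively sees its own weight matrix W·diag(c_k) while the stored parameters are shared. The layer below is an illustrative reconstruction under that reading, not the authors' implementation; the class name and the use of non-trainable ±1 contexts are assumptions.

```python
import torch
import torch.nn as nn


class BinaryPSPLinear(nn.Module):
    """Shared linear layer with one fixed random +/-1 context vector per task.

    For task k, the effective weights are W * diag(c_k): the input is
    element-wise multiplied by the task's context before the shared map,
    so many task-specific models are superimposed in one parameter set.
    """

    def __init__(self, in_features, out_features, n_tasks):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        # Fixed (non-trainable) random sign contexts, one per task.
        contexts = torch.randint(0, 2, (n_tasks, in_features)) * 2 - 1
        self.register_buffer("contexts", contexts.float())

    def forward(self, x, task_id):
        return self.linear(x * self.contexts[task_id])
```

A containing network would pass the current task index alongside the input; only the shared weights in `self.linear` are trained, while the fixed contexts select which superposed model is read out.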