Superposition of many models into one

Authors: Brian Cheung, Alexander Terekhov, Yubei Chen, Pulkit Agrawal, Bruno Olshausen

NeurIPS 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In experiments with neural networks, we show that a surprisingly large number of models can be effectively stored within a single parameter instance. We demonstrate the efficacy of our approach of learning via parameter superposition on two separate online image-classification settings: (a) time-varying input data distribution and (b) time-varying output label distribution. (A code sketch of the superposition idea follows this table.)
Researcher Affiliation | Academia | Brian Cheung (Redwood Center, BAIR, UC Berkeley, bcheung@berkeley.edu); Alex Terekhov (Redwood Center, UC Berkeley, aterekhov@berkeley.edu); Yubei Chen (Redwood Center, BAIR, UC Berkeley, yubeic@berkeley.edu); Pulkit Agrawal (BAIR, UC Berkeley, pulkitag@berkeley.edu); Bruno Olshausen (Redwood Center, BAIR, UC Berkeley, baolshausen@berkeley.edu)
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any explicit statements about releasing source code or links to a code repository for the described methodology.
Open Datasets | Yes | Permuting MNIST dataset [2] is a variant of the MNIST dataset [7]... rotating-MNIST and rotating-Fashion MNIST that are variants of the original MNIST and Fashion MNIST [21] datasets... The incremental CIFAR (iCIFAR) dataset [16, 22] (see Figure 6a) is a variant of the CIFAR dataset [6]...
Dataset Splits | No | The paper describes training and testing procedures on various datasets (e.g., Permuting MNIST, Rotating MNIST/Fashion MNIST, iCIFAR) and mentions the use of test sets, but it does not explicitly provide training/validation/test split details, such as percentages, sample counts, or references to standard validation splits, for all experiments.
Hardware Specification | No | The paper does not provide specific details on the hardware (e.g., GPU models, CPU types, memory) used for running the experiments.
Software Dependencies | No | The paper does not specify any software dependencies with version numbers (e.g., Python, PyTorch, or TensorFlow versions, or other libraries).
Experiment Setup | Yes | We trained fully-connected networks with two hidden layers on fifty permuting MNIST tasks presented sequentially. The size of the hidden layers was varied from 128 to 2048 units. In our setup, a new task is created after every 1000 mini-batches (steps) of training by permuting the image pixels. To show that our PSP method can be used with state-of-the-art neural networks, we used ResNet-18 to first train on the CIFAR-10 dataset for 20K steps. Next, we trained the network for 20K steps on four subsequent and disjoint sets of 10 classes chosen from the CIFAR-100 dataset. (The permuting-MNIST task protocol is sketched below the table.)
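
The Research Type row quotes the paper's core idea: many task-specific models stored within a single parameter instance via parameter superposition. Below is a minimal sketch of that idea, assuming the binary (+/-1) context variant and PyTorch; the class name PSPLinear, the buffer layout, and the task_id argument are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn


class PSPLinear(nn.Module):
    """Linear layer sharing one weight matrix across tasks.

    Before the shared weights are applied, the input is element-wise
    multiplied by a fixed random +/-1 context vector chosen per task,
    so different tasks occupy approximately orthogonal subspaces of
    the same parameter instance.
    """

    def __init__(self, in_features: int, out_features: int, num_tasks: int):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        # One fixed (untrained) random binary +/-1 context per task.
        contexts = torch.randint(0, 2, (num_tasks, in_features)).float() * 2 - 1
        self.register_buffer("contexts", contexts)

    def forward(self, x: torch.Tensor, task_id: int) -> torch.Tensor:
        # Rotate the input into the task's subspace, then apply the shared weights.
        return self.linear(x * self.contexts[task_id])
```

The paper also describes other context choices (e.g., complex-valued and rotational contexts); this sketch covers only the binary case for a single layer.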
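The Experiment Setup row describes the permuting-MNIST protocol: fifty tasks presented sequentially, with a new task created every 1000 mini-batches by permuting the image pixels. The following is a small sketch of that task-generation schedule, assuming flattened 28x28 inputs and NumPy; the constant names and helper functions are hypothetical.

```python
import numpy as np

NUM_TASKS = 50          # fifty permuting-MNIST tasks presented sequentially
STEPS_PER_TASK = 1000   # a new task after every 1000 mini-batches (steps)
NUM_PIXELS = 28 * 28

rng = np.random.default_rng(seed=0)
# One fixed random pixel permutation per task.
permutations = [rng.permutation(NUM_PIXELS) for _ in range(NUM_TASKS)]


def permute_batch(images: np.ndarray, task_id: int) -> np.ndarray:
    """Apply a task's fixed pixel permutation to flattened images of shape (batch, 784)."""
    return images[:, permutations[task_id]]


def task_schedule(total_steps: int = NUM_TASKS * STEPS_PER_TASK):
    """Yield (step, task_id) pairs: the active task advances every STEPS_PER_TASK mini-batches."""
    for step in range(total_steps):
        yield step, step // STEPS_PER_TASK
```

During sequential training, the active permutation (and, for a PSP layer like the sketch above, the matching task context) would switch at each 1000-step boundary.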