HyperGAN: A Generative Model for Diverse, Performant Neural Networks
Authors: Neale Ratzlaff, Li Fuxin
ICML 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct a variety of experiments to test HyperGAN's ability to achieve both high accuracy and obtain accurate uncertainty estimates. First we show classification performance on both MNIST and CIFAR-10 datasets. |
| Researcher Affiliation | Academia | School of Electrical Engineering and Computer Science, Oregon State University. Correspondence to: Neale Ratzlaff <ratzlafn@oregonstate.edu>, Li Fuxin <lif@oregonstate.edu>. |
| Pseudocode | No | The paper provides architectural diagrams (Figure 1) and mathematical formulations, but it does not include any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code or provide a link to a code repository. |
| Open Datasets | Yes | First we show classification performance on both MNIST and CIFAR-10 datasets. |
| Dataset Splits | Yes | For MNIST experiments we train a HyperGAN on the MNIST dataset, and test on the notMNIST dataset: a 10-class set of 28x28 grayscale images depicting the letters A through J. In this setting, we want the softmax probabilities on inlier MNIST examples to have minimum entropy: a single large activation close to 1. On out-of-distribution data we want to have equal probability across predictions. Similarly, we test our CIFAR-10 model by training on the first 5 classes, and using the latter 5 classes as out-of-distribution examples. (See the illustrative data-split and entropy sketch after this table.) |
| Hardware Specification | No | The paper mentions memory usage and the type of computing unit: 'We trained our HyperGAN on MNIST using less than 1.5GB of memory on a single GPU, while CIFAR-10 used just 4GB, making HyperGAN surprisingly scalable.' However, it does not specify the exact model of the GPU or any other hardware components like CPU or RAM. |
| Software Dependencies | No | The paper describes the models and methods used but does not specify any software libraries, frameworks, or their version numbers that were used for implementation or experimentation. |
| Experiment Setup | Yes | For MNIST experiments, HyperGAN has weight generators, each taking a latent vector z ∼ Q(z|s) ∈ R^128 as input. The target network for the MNIST experiments is a small two-layer convolutional network followed by 1 fully-connected layer, using leaky ReLU activations and 2x2 max pooling after each convolutional layer. For CIFAR-10, we use 5 weight generators with latent codes z ∼ Q(z|s) ∈ R^256. The target architecture for CIFAR-10 consists of three convolutional layers, each followed by leaky ReLU and 2x2 max pooling, followed by 2 fully connected layers. The mixer, generators, and discriminator are each 2-layer MLPs with 512 units in each layer and ReLU nonlinearity. (See the illustrative architecture sketch after this table.) |
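
The dataset-split row above describes the out-of-distribution protocol: train on MNIST and test on notMNIST, or train on the first 5 CIFAR-10 classes and treat the remaining 5 as OOD, judging uncertainty by the entropy of the softmax predictions. The following is a minimal sketch of that CIFAR-10 split and entropy computation, assuming a PyTorch/torchvision stack (the paper does not state its software dependencies); class thresholds follow the quoted description, all other names are illustrative.

```python
# Hypothetical sketch (not from the paper): CIFAR-10 in/out-of-distribution split
# as described above -- first 5 classes in-distribution, last 5 classes OOD --
# plus the predictive entropy used to judge uncertainty quality.
import torch
from torch.utils.data import Subset
from torchvision import datasets, transforms

cifar = datasets.CIFAR10(root="./data", train=True, download=True,
                         transform=transforms.ToTensor())
targets = torch.tensor(cifar.targets)

inlier_idx = torch.nonzero(targets < 5, as_tuple=True)[0]   # classes 0-4: in-distribution
ood_idx    = torch.nonzero(targets >= 5, as_tuple=True)[0]  # classes 5-9: out-of-distribution

inlier_set = Subset(cifar, inlier_idx.tolist())
ood_set    = Subset(cifar, ood_idx.tolist())

def predictive_entropy(logits: torch.Tensor) -> torch.Tensor:
    """Entropy of the softmax distribution: low on inliers (one activation near 1),
    high (near log of the number of classes) when probability is spread evenly."""
    probs = torch.softmax(logits, dim=-1)
    return -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)
```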
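The experiment-setup row describes the MNIST target network as two convolutional layers, each followed by leaky ReLU and 2x2 max pooling, then one fully connected layer. Below is a hedged PyTorch sketch of such a network; the channel counts and kernel sizes are assumptions, since the paper does not specify them, and in HyperGAN the weights of this network would be produced by the weight generators rather than trained directly.

```python
# Hypothetical sketch of the MNIST target architecture described above.
# Channel count and kernel size are assumed; only the layer structure
# (2 conv + leaky ReLU + 2x2 max pool, then 1 fully connected layer) is from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MNISTTarget(nn.Module):
    def __init__(self, channels: int = 32, num_classes: int = 10):
        super().__init__()
        self.conv1 = nn.Conv2d(1, channels, kernel_size=5)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=5)
        self.fc = nn.Linear(channels * 4 * 4, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = F.max_pool2d(F.leaky_relu(self.conv1(x)), 2)  # 28x28 -> 24x24 -> 12x12
        x = F.max_pool2d(F.leaky_relu(self.conv2(x)), 2)  # 12x12 -> 8x8 -> 4x4
        return self.fc(x.flatten(1))

# Usage: a batch of 8 grayscale 28x28 images maps to a [8, 10] logit tensor.
logits = MNISTTarget()(torch.randn(8, 1, 28, 28))
```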