Modular Networks: Learning to Decompose Neural Computation
Authors: Louis Kirsch, Julius Kunze, David Barber
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We apply modular networks both to image recognition and language modeling tasks, where we achieve superior performance compared to several baselines. Introspection reveals that modules specialize in interpretable contexts. |
| Researcher Affiliation | Academia | Louis Kirsch, Department of Computer Science, University College London, mail@louiskirsch.com; Julius Kunze, Department of Computer Science, University College London, juliuskunze@gmail.com; David Barber, Department of Computer Science, University College London, david.barber@ucl.ac.uk. Footnote: now affiliated with IDSIA, The Swiss AI Lab (USI & SUPSI). |
| Pseudocode | Yes | Algorithm 1: Training modular networks with generalized EM (a training-step sketch appears after the table) |
| Open Source Code | Yes | A library to use modular layers in TensorFlow can be found at http://louiskirsch.com/libmodular. |
| Open Datasets | Yes | We use the Penn Treebank dataset, consisting of 0.9 million words with a vocabulary size of 10,000. (Footnote 2: http://www.fit.vutbr.cz/~imikolov/rnnlm/simple-examples.tgz) and We applied our method to image classification on CIFAR10 [13] |
| Dataset Splits | No | No specific train/validation/test dataset splits (e.g., percentages, sample counts, or explicit mention of validation sets) are provided in the main text. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments are mentioned in the paper. |
| Software Dependencies | No | The paper mentions "TensorFlow" in the context of their library, but does not provide version numbers for TensorFlow or any other software dependencies. |
| Experiment Setup | Yes | Unless noted otherwise, we use a controller consisting of a linear transformation followed by a softmax function for each of the K modules to select. Our modules are either linear transformations or convolutions, followed by a ReLU activation. Additional experimental details are given in the supplementary material. (A layer sketch follows the table.) |
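
The Experiment Setup row describes the architecture only in prose. The following minimal NumPy sketch illustrates that structure: a controller built from a linear transformation followed by a softmax over the K modules, and modules implemented as linear maps followed by a ReLU activation. All names (`ModularLayer`, `module_probs`, `forward`) and shapes are illustrative assumptions, not the authors' libmodular API.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

class ModularLayer:
    """Sketch of one modular layer: a linear-softmax controller scores
    K candidate modules per input; the selected module (linear + ReLU)
    transforms the input. Names and shapes are illustrative only."""

    def __init__(self, in_dim, out_dim, num_modules, seed=0):
        rng = np.random.default_rng(seed)
        self.Wc = rng.normal(0, 0.01, (in_dim, num_modules))          # controller weights
        self.Wm = rng.normal(0, 0.01, (num_modules, in_dim, out_dim)) # one linear map per module
        self.bm = np.zeros((num_modules, out_dim))

    def module_probs(self, x):
        # Controller: linear transformation followed by a softmax over K modules.
        return softmax(x @ self.Wc)                                   # (batch, K)

    def forward(self, x, module_idx):
        # Apply each example's selected module: linear map, then ReLU.
        h = np.einsum('bi,bio->bo', x, self.Wm[module_idx]) + self.bm[module_idx]
        return np.maximum(h, 0.0)

# Usage: pick one module per example from the controller, then apply it.
layer = ModularLayer(in_dim=8, out_dim=4, num_modules=3)
x = np.random.default_rng(1).normal(size=(5, 8))
a = layer.module_probs(x).argmax(axis=1)   # hard module choice per example
y = layer.forward(x, a)                    # (5, 4)
```

The paper's convolutional variant would replace the per-module linear maps with convolutions; the controller structure stays the same.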
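Algorithm 1 is referenced in the Pseudocode row but not reproduced here. Under the same assumptions as the sketch above, one generalized-EM step might look as follows: a partial E-step scores every module on every example and keeps the assignment maximizing log p(y|x,a) + log p(a|x), and the M-step takes a gradient step on that assignment. This is a sketch in the spirit of the algorithm, not the authors' code; `log_likelihood` is an assumed helper standing in for the task model, and for brevity only the controller update is shown.

```python
import numpy as np

def em_train_step(layer, x_batch, y_batch, log_likelihood, lr=1e-2):
    """One generalized-EM step for a single ModularLayer (sketch).
    log_likelihood(x, y, module_idx) is an assumed helper returning
    per-example log p(y | x, a) under module assignment a."""
    K = layer.Wc.shape[1]
    probs = layer.module_probs(x_batch)          # p(a | x), shape (batch, K)

    # Partial E-step: score each module on each example and pick the
    # assignment maximizing log p(y|x,a) + log p(a|x).
    scores = np.stack(
        [log_likelihood(x_batch, y_batch, np.full(len(x_batch), k))
         + np.log(probs[:, k] + 1e-12)
         for k in range(K)],
        axis=1)                                  # (batch, K)
    best = scores.argmax(axis=1)                 # best module per example

    # M-step (controller part): one cross-entropy gradient step pushing
    # the softmax toward the selected modules.
    grad_logits = probs.copy()
    grad_logits[np.arange(len(best)), best] -= 1.0
    layer.Wc -= lr * (x_batch.T @ grad_logits) / len(x_batch)
    return best
```

In the full algorithm the module parameters are updated in the same M-step by ascending log p(y|x,a) for the chosen assignments; that update is omitted here to keep the sketch short.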