Recurrent Convolutional Neural Networks Learn Succinct Learning Algorithms
Authors: Surbhi Goel, Sham Kakade, Adam Tauman Kalai, Cyril Zhang
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | Our architecture combines both recurrent weight sharing between layers and convolutional weight sharing to reduce the number of parameters down to a constant, even though the network itself may have trillions of nodes. The primary limitation of this work is that the constant factors in our analysis are much too large to be meaningful in practice. Nonetheless, it does suggest that the ingredients used in the architecture, especially the combination of recurrent weight-sharing across layers and convolutional weight-sharing within layers, may be useful in designing practical architectures for NNs to learn algorithms. The Ethics Review section also states 'N/A' for running experiments, confirming its theoretical nature. |
| Researcher Affiliation | Collaboration | Surbhi Goel (Microsoft Research & University of Pennsylvania, surbhig@cis.upenn.edu); Sham Kakade (Harvard University, sham@seas.harvard.edu); Adam Tauman Kalai (Microsoft Research, adam@kal.ai); Cyril Zhang (Microsoft Research, cyrilzhang@microsoft.com) |
| Pseudocode | Yes | Algorithm 1 SGD on randomly initialized RCNN |
| Open Source Code | No | The Ethics Review section includes the question: 'Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)?' to which the answer is '[N/A]', indicating no code is provided. |
| Open Datasets | No | The paper is theoretical and does not report on experiments using a specific dataset. The 'Ethics Review' section indicates 'N/A' for questions related to data and experiments. |
| Dataset Splits | No | The paper is theoretical and does not report on experimental dataset splits. The 'Ethics Review' section indicates 'N/A' for questions related to experiments. |
| Hardware Specification | No | The paper is theoretical and does not report on experiments that would require specific hardware. The 'Ethics Review' section explicitly states 'N/A' for compute resources used. |
| Software Dependencies | No | The paper is theoretical and does not report on experiments that would require specific software dependencies with version numbers. It mentions PyTorch as a modern library for describing the architecture, but not as a dependency for empirical work. |
| Experiment Setup | No | The paper is theoretical and does not provide details about an experimental setup, hyperparameters, or system-level training settings. The 'Ethics Review' section indicates 'N/A' for running experiments. |
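
The weight-sharing idea quoted in the Research Type row can be illustrated with a toy sketch. The code below is not the paper's construction or its Algorithm 1; it is a minimal pure-Python example, with hypothetical names (`conv1d_same`, `rcnn_forward`, `num_parameters`, `num_nodes`), showing how one small kernel reused at every position (convolutional sharing) and at every layer (recurrent sharing) keeps the parameter count fixed while the number of nodes grows with width and depth.

```python
def conv1d_same(x, kernel, bias):
    """Apply one shared 1D kernel at every position (zero padding), then ReLU."""
    k = len(kernel)
    pad = k // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    out = []
    for i in range(len(x)):
        s = bias + sum(kernel[j] * padded[i + j] for j in range(k))
        out.append(max(0.0, s))  # ReLU nonlinearity
    return out


def rcnn_forward(x, kernel, bias, depth):
    """Reuse the same kernel and bias at every layer (recurrent sharing)."""
    h = list(x)
    for _ in range(depth):
        h = conv1d_same(h, kernel, bias)
    return h


def num_parameters(kernel):
    """Trainable scalars: the kernel entries plus one bias."""
    return len(kernel) + 1


def num_nodes(width, depth):
    """Units in the unrolled network: one per position per layer."""
    return width * depth


if __name__ == "__main__":
    kernel, bias = [0.25, 0.5, 0.25], 0.0  # illustrative values, assumed
    print(rcnn_forward([0.0, 0.0, 1.0, 0.0, 0.0], kernel, bias, depth=2))
    print(num_parameters(kernel), num_nodes(10**6, 10**6))
```

In this sketch the parameter count stays at 4 whether the unrolled network has ten nodes or 10^12, which is the sense in which the quoted passage speaks of a constant number of parameters in a network with trillions of nodes.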