Learning and Generalization in RNNs
Authors: Abhishek Panigrahi, Navin Goyal
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We illustrate our results on some regular language recognition problems. ... We performed few toy experiments on the ability of invertibility for RNNs at initialization (Sec. I). We observed, as predicted by our theorem above, that the error involved in inversion decreases with the number of neurons and increases with the length of the sequence (Fig. 4). ... RNNs perform well on regular language recognition task in our experiments in Sec. I. |
| Researcher Affiliation | Collaboration | Abhishek Panigrahi Department of Computer Science Princeton University ap34@princeton.edu Navin Goyal Microsoft Research India navingo@microsoft.com Work done as a research fellow in Microsoft Research India |
| Pseudocode | No | The paper does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any statement about releasing open-source code or a link to a code repository. |
| Open Datasets | No | The paper describes generating data from an unknown distribution D and mentions 'regular language recognition problems' for illustration, but it does not provide concrete access information (e.g., links, DOIs, citations with author/year) for any specific public or open datasets used for training. |
| Dataset Splits | No | The paper does not provide specific details on training, validation, or test dataset splits, such as percentages or sample counts. |
| Hardware Specification | No | The paper does not specify any hardware details (e.g., GPU/CPU models, memory) used for running experiments. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers. |
| Experiment Setup | No | The paper gives only theoretical expressions for the learning rate η and the number of steps T in terms of other variables (e.g., η = Θ(1/(ε ρ² m)) and T = Θ(p² C² poly(ρ) ε⁻²)); it does not report concrete numerical values or detailed system-level training configurations for a practical experimental setup. |