Learning and Generalization in RNNs

Authors: Abhishek Panigrahi, Navin Goyal

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We illustrate our results on some regular language recognition problems. ... We performed a few toy experiments on the invertibility of RNNs at initialization (Sec. I). We observed, as predicted by our theorem above, that the error involved in inversion decreases with the number of neurons and increases with the length of the sequence (Fig. 4). ... RNNs perform well on the regular language recognition task in our experiments in Sec. I." (A hedged stand-in sketch of this inversion probe appears after the table.)
Researcher Affiliation | Collaboration | Abhishek Panigrahi, Department of Computer Science, Princeton University (ap34@princeton.edu), and Navin Goyal, Microsoft Research India (navingo@microsoft.com). Panigrahi's work was done as a research fellow at Microsoft Research India.
Pseudocode | No | The paper does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any statement about releasing open-source code or a link to a code repository.
Open Datasets | No | The paper describes generating data from an unknown distribution D and mentions "regular language recognition problems" for illustration, but it does not provide concrete access information (e.g., links, DOIs, or author/year citations) for any specific public dataset used for training.
Dataset Splits | No | The paper does not provide specific details on training, validation, or test dataset splits, such as percentages or sample counts.
Hardware Specification | No | The paper does not specify any hardware details (e.g., GPU/CPU models, memory) used for running experiments.
Software Dependencies | No | The paper does not list specific software dependencies with version numbers.
Experiment Setup | No | The paper defines the learning rate η and the number of steps T only theoretically, as η = Θ(1/(ε ρ^2 m)) and T = Θ(p^2 C^2 poly(ρ) ε^(-2)) (typeset below), and does not give concrete numerical values or a system-level training configuration for a practical experimental setup.
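
For readability, the theoretical schedule quoted in the Experiment Setup row can be typeset as follows; here ε is the target error, m the RNN width, and ρ, p, C are complexity parameters defined in the paper. The grouping of factors is reconstructed from the extracted text and should be checked against the paper.

```latex
% Theoretical hyperparameters quoted in the Experiment Setup row.
% epsilon = target error, m = RNN width; rho, p, C are the paper's
% complexity parameters. Grouping reconstructed from the extracted text.
\[
  \eta = \Theta\!\left(\frac{1}{\varepsilon\,\rho^{2}\,m}\right),
  \qquad
  T = \Theta\!\left(p^{2} C^{2}\,\mathrm{poly}(\rho)\,\varepsilon^{-2}\right)
\]
```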
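
The paper releases no code for the invertibility experiment quoted in the Research Type row, so the sketch below is only a generic stand-in: it measures how well a least-squares linear probe recovers the first input from the final hidden state of a randomly initialized ReLU RNN, sweeping width and sequence length to mirror the reported trends. The estimator, the He-style initialization scale, the unit-norm inputs, and all constants are assumptions, not the paper's procedure.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 10                       # input dimension (arbitrary choice)
N_train, N_test = 1200, 400  # probe training / evaluation sequences

def final_states(W, A, X):
    """Batched ReLU RNN run from the zero state.
    X has shape (N, L, d); returns the final hidden states, shape (N, m)."""
    N, L, _ = X.shape
    H = np.zeros((N, W.shape[0]))
    for t in range(L):
        # h_t = ReLU(W h_{t-1} + A x_t), vectorized over the batch
        H = np.maximum(H @ W.T + X[:, t, :] @ A.T, 0.0)
    return H

def probe_error(m, L, rng):
    """Fit a linear decoder h_L -> x_1 by least squares and report its mean
    test error. This is a generic linear probe, NOT the paper's inversion
    procedure; it only gauges how recoverable the earliest input is."""
    W = rng.normal(0.0, np.sqrt(2.0 / m), (m, m))  # He-style Gaussian init (assumption)
    A = rng.normal(0.0, np.sqrt(2.0 / m), (m, d))
    X = rng.normal(size=(N_train + N_test, L, d))
    X /= np.linalg.norm(X, axis=2, keepdims=True)  # unit-norm inputs (assumption)
    H = final_states(W, A, X)
    B, *_ = np.linalg.lstsq(H[:N_train], X[:N_train, 0, :], rcond=None)
    resid = H[N_train:] @ B - X[N_train:, 0, :]
    return np.linalg.norm(resid, axis=1).mean()

for m in (50, 200, 800):   # width sweep: reported trend is error shrinking with m
    for L in (4, 8, 16):   # length sweep: reported trend is error growing with L
        print(f"m={m:4d}  L={L:3d}  probe error = {probe_error(m, L, rng):.3f}")
```

A linear probe only upper-bounds the information lost by the network, so the absolute errors are not comparable to the paper's Fig. 4; only the qualitative width and length trends are relevant here.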