On Scrambling Phenomena for Randomly Initialized Recurrent Networks
Authors: Vaggos Chatziafratis, Ioannis Panageas, Clayton Sanford, Stelios Stavroulakis
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our theoretical findings explain empirically observed behavior of RNNs from prior works and are also validated in our experiments. Our goal is twofold: first, to demonstrate that Li-Yorke chaos is indeed present across different models and different initialization heuristics; and second, to empirically estimate bounds on how often scrambling phenomena appear. |
| Researcher Affiliation | Academia | Vaggos Chatziafratis, Department of Computer Science and Engineering, University of California, Santa Cruz (vaggos@ucsc.edu); Ioannis Panageas, Department of Computer Science, University of California, Irvine (ipanagea@ics.uci.edu); Clayton Sanford, Department of Computer Science, Columbia University (clayton@cs.columbia.edu); Stelios Andrew Stavroulakis, Department of Computer Science, University of California, Irvine (sstavrou@uci.edu) |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | Our code is made publicly available here: https://github.com/steliostavroulakis/Chaos_RNNs/blob/main/Depth_2_RNNs_and_Chaos_Period_3_Probability_RNNs.ipynb |
| Open Datasets | Yes | In higher dimensions, we used an MNIST dataset as input to a 64-dimensional RNN with 1 hidden layer, width 64, fully connected, with ReLU activation functions and He initialization. |
| Dataset Splits | No | The paper does not provide specific train/validation/test dataset splits. The checklist also states N/A for this question: "Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [N/A]" |
| Hardware Specification | No | The paper does not explicitly describe the hardware used for its experiments. The checklist states N/A for this question: "Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [N/A]" |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, CUDA versions) in the main text for its experiments. |
| Experiment Setup | Yes | Our table in Fig. 3 summarizes the results across 10000 runs of each experiment. For each experiment, the first layer has width 2, where the weights and biases of each neuron are initialized according to each line, whereas the second layer has width 1 with weight and bias as specified in each line. In higher dimensions, we used an MNIST dataset as input to a 64-dimensional RNN with 1 hidden layer, width 64, fully connected, with ReLU activation functions and He initialization. |
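The low-dimensional experiment described above (a width-2 ReLU layer followed by a width-1 output, random weights and biases, many repeated runs) can be sketched as follows. This is a minimal illustration, not the authors' notebook code: the function names, the Gaussian weight distribution, the search grid, and the numerical period-3 test are all assumptions. The test looks for a sign change of f³(x) − x on a grid, which is a crude proxy (fixed points of f itself also solve f³(x) = x, so it can over-count); by the Li-Yorke theorem, a genuine period-3 point of a continuous scalar map implies chaos.

```python
import numpy as np

def depth2_relu_map(W1, b1, w2, b2):
    """Scalar map f(x) = w2 . ReLU(W1*x + b1) + b2 realized by a
    width-2 ReLU hidden layer and a width-1 linear output layer."""
    def f(x):
        h = np.maximum(W1 * x + b1, 0.0)   # hidden layer, width 2
        return float(w2 @ h + b2)          # output layer, width 1
    return f

def has_period3_candidate(f, grid=np.linspace(-10.0, 10.0, 2001)):
    """Crude numerical test: detect a sign change of g(x) = f^3(x) - x
    on a grid, i.e. a root of the third iterate. NOTE: fixed points of
    f are also roots of g, so this over-counts true period-3 orbits."""
    g = np.array([f(f(f(x))) - x for x in grid])
    return bool(np.any(np.sign(g[:-1]) * np.sign(g[1:]) < 0))

def estimate_period3_probability(n_runs=1000, scale=2.0, seed=0):
    """Fraction of random initializations (here: i.i.d. Gaussian with
    the assumed standard deviation `scale`) whose induced scalar map
    passes the grid-based period-3 candidate test."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_runs):
        W1 = rng.normal(0.0, scale, size=2)   # first-layer weights
        b1 = rng.normal(0.0, scale, size=2)   # first-layer biases
        w2 = rng.normal(0.0, scale, size=2)   # second-layer weights
        b2 = rng.normal(0.0, scale)           # second-layer bias
        if has_period3_candidate(depth2_relu_map(W1, b1, w2, b2)):
            hits += 1
    return hits / n_runs
```

Swapping the Gaussian sampler for other initialization heuristics (and raising `n_runs` to the paper's 10000) would reproduce the shape of the experiment, though the exact per-line initializations come from Fig. 3 of the paper.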