On Scrambling Phenomena for Randomly Initialized Recurrent Networks

Authors: Vaggos Chatziafratis, Ioannis Panageas, Clayton Sanford, Stelios Stavroulakis

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our theoretical findings explain empirically observed behavior of RNNs from prior works, and are also validated in our experiments. Our goal is twofold: first, to demonstrate that Li-Yorke chaos is indeed present across different models and different initialization heuristics, and second, to empirically estimate bounds on how often scrambling phenomena appear. (The first sketch after the table illustrates the period-3 criterion behind this claim.)
Researcher Affiliation | Academia | Vaggos Chatziafratis, Department of Computer Science and Engineering, University of California, Santa Cruz (vaggos@ucsc.edu); Ioannis Panageas, Department of Computer Science, University of California, Irvine (ipanagea@ics.uci.edu); Clayton Sanford, Department of Computer Science, Columbia University (clayton@cs.columbia.edu); Stelios Andrew Stavroulakis, Department of Computer Science, University of California, Irvine (sstavrou@uci.edu)
Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks.
Open Source Code | Yes | Our code is made publicly available here: https://github.com/steliostavroulakis/Chaos_RNNs/blob/main/Depth_2_RNNs_and_Chaos_Period_3_Probability_RNNs.ipynb
Open Datasets | Yes | In higher dimensions, we used the MNIST dataset as input to a 64-dimensional RNN with 1 hidden layer, width 64, fully connected, with ReLU activation functions and He initialization. (The second sketch after the table shows this setup.)
Dataset Splits | No | The paper does not provide specific train/validation/test dataset splits. The checklist also states N/A for this question: "Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [N/A]"
Hardware Specification | No | The paper does not explicitly describe the hardware used for its experiments. The checklist states N/A for this question: "Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [N/A]"
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, CUDA versions) in the main text for its experiments.
Experiment Setup | Yes | Our table in Fig. 3 summarizes the results across 10000 runs of each experiment. For each experiment, the first layer has width 2, with the weights and biases of each neuron initialized according to the scheme in the corresponding line of the table, while the second layer has width 1 with the weight and bias specified in that line. In higher dimensions, we used the MNIST dataset as input to a 64-dimensional RNN with 1 hidden layer, width 64, fully connected, with ReLU activation functions and He initialization. (The third sketch below mirrors the 10000-run estimate.)
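The chaos criterion the paper relies on is due to Li and Yorke (1975): for a continuous map on an interval, a point of period 3 implies chaos. The minimal sketch below (not the authors' code) uses hand-picked, illustrative weights for a depth-2 ReLU network that realize the tent map f(x) = 2*relu(x) - 4*relu(x - 1/2) on [0, 1], which has the period-3 orbit {2/7, 4/7, 6/7}; these specific weights are an assumption for illustration, not an initialization the paper samples.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def depth2_net(x, W1, b1, W2, b2):
    """Depth-2 ReLU network: hidden width 2, scalar output."""
    return W2 @ relu(W1 * x + b1) + b2

# Hand-picked (illustrative) weights realizing the tent map
# f(x) = 2*relu(x) - 4*relu(x - 1/2) on [0, 1].
W1, b1 = np.array([1.0, 1.0]), np.array([0.0, -0.5])
W2, b2 = np.array([2.0, -4.0]), 0.0
f = lambda x: depth2_net(x, W1, b1, W2, b2)

# {2/7, 4/7, 6/7} is a period-3 orbit of the tent map; by Li and
# Yorke (1975), period 3 implies chaos for continuous interval maps.
x = 2.0 / 7.0
assert np.isclose(f(f(f(x))), x) and not np.isclose(f(x), x)
print("period-3 orbit:", [x, f(x), f(f(x))])
```

Any depth-2 ReLU network whose iteration admits such a non-fixed point returning to itself after three steps is Li-Yorke chaotic, which is the property the table above reports on.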
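For the higher-dimensional experiment, here is a minimal sketch of the He-initialized, width-64 ReLU recurrent update. The random unit vector standing in for an MNIST-derived input, and the divergence-of-nearby-states diagnostic, are assumptions of this sketch rather than the notebook's exact procedure.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 64

# He initialization for a ReLU layer: W_ij ~ N(0, 2 / fan_in).
W = rng.normal(0.0, np.sqrt(2.0 / d), size=(d, d))

def step(h):
    """One recurrent update: h <- ReLU(W h)."""
    return np.maximum(W @ h, 0.0)

# Assumption: a random unit vector stands in for the 64-dimensional
# MNIST-derived input used in the paper's notebook.
h0 = rng.normal(size=d)
h0 /= np.linalg.norm(h0)
h1 = h0 + 1e-8 * rng.normal(size=d)  # a nearby initial state

for t in range(30):
    h0, h1 = step(h0), step(h1)
    # A rapidly growing relative gap between the two trajectories
    # indicates sensitivity to initial conditions.
    print(t, np.linalg.norm(h0 - h1) / (np.linalg.norm(h0) + 1e-12))
```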
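Finally, a sketch in the spirit of the linked notebook (Period_3_Probability_RNNs): sample many random depth-2 networks (hidden width 2, scalar output) and count how often a period-3 point appears. The standard normal initialization, the search interval, and the grid-based sign-change test are assumptions of this sketch; the table in Fig. 3 actually sweeps several initialization schemes line by line.

```python
import numpy as np

rng = np.random.default_rng(1)
relu = lambda z: np.maximum(z, 0.0)

def sample_net():
    """Depth-2 net: hidden width 2, scalar output. Assumption:
    standard normal weights and biases."""
    W1, b1 = rng.normal(size=2), rng.normal(size=2)
    W2, b2 = rng.normal(size=2), rng.normal()
    return lambda x: W2 @ relu(np.outer(W1, x) + b1[:, None]) + b2

def has_period_3(f, lo=-5.0, hi=5.0, n=2000):
    """Crude numerical test: a sign change of f^3(x) - x away from
    fixed points of f witnesses a period-3 point (up to grid error)."""
    xs = np.linspace(lo, hi, n)
    g = f(f(f(xs))) - xs
    not_fixed = np.abs(f(xs) - xs) > 1e-6
    sign_change = g[:-1] * g[1:] < 0
    return bool(np.any(sign_change & not_fixed[:-1] & not_fixed[1:]))

trials = 10_000  # matches the 10000 runs per experiment in Fig. 3
hits = sum(has_period_3(sample_net()) for _ in range(trials))
print(f"estimated period-3 probability: {hits / trials:.4f}")
```

The resulting frequency is an empirical estimate of how often a random initialization already places the network in the Li-Yorke chaotic regime, which is what the Experiment Setup row reports per initialization scheme.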