Learning dynamics of deep linear networks with multiple pathways

Authors: Jianghong Shi, Eric Shea-Brown, Michael Buice

NeurIPS 2022

Reproducibility Variable Result LLM Response
Research Type Experimental This result is derived analytically and demonstrated with numerical simulations of both linear and non-linear networks. We demonstrate our results with numerical simulations of networks with two pathways and multiple depths.
Researcher Affiliation Collaboration Jianghong Shi, Department of Applied Mathematics, University of Washington, Seattle, WA 98195, jhshi@uw.edu; Eric Shea-Brown, Department of Applied Mathematics, University of Washington, Seattle, WA 98195, etsb@uw.edu; Michael A. Buice, Allen Institute Mind Scope Program, Seattle, WA 98109, michaelbu@alleninstitute.org
Pseudocode No The paper describes mathematical derivations and simulation procedures through narrative and equations, but it does not include any explicitly labeled pseudocode or algorithm blocks, nor any structured, code-like steps.
Open Source Code Yes Code for simulations and figures is available at https://github.com/AllenInstitute/Multipathway_NeurIPS2022.
Open Datasets No The input vectors x are 8-dimensional and are the rows of the 8-dimensional identity matrix. The output vectors y are 15-dimensional and are the rows of the matrix: [matrix provided in paper] (The paper defines the data used for training internally rather than citing or linking to an external public dataset.)
Dataset Splits No We train the network with a set of P examples {xi, yi}, i = 1, 2, . . . , P with gradient descent on the squared loss. (The paper does not provide explicit details regarding train/validation/test splits, such as percentages or sample counts, nor does it refer to predefined splits from known datasets.)
Hardware Specification No These simulations are not compute intensive and are easily performed on a standard modern desktop or laptop. (This statement is too general and does not provide specific hardware details such as CPU/GPU models, memory, or cloud resources used.)
Software Dependencies No The paper does not provide specific software dependencies with version numbers (e.g., programming languages, libraries, or frameworks with their exact versions) that were used for the experiments.
Experiment Setup Yes For these examples we use the same number of layers per pathway and N1 = N2 = 1000. The initial state of the weight matrices is drawn from a zero mean normal distribution with a fixed standard deviation σ = 0.01. Gradient descent is performed over 1000 epochs with learning rate lr = 0.01.
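The setup described above (identity-matrix inputs, two parallel pathways, Gaussian initialization with σ = 0.01, full-batch gradient descent on the squared loss with lr = 0.01 over 1000 epochs) can be sketched in NumPy. This is a minimal illustration, not the authors' code: the paper's fixed 15-dimensional output matrix is replaced by a random placeholder, each pathway is shown with a single hidden layer, and the hidden width is reduced from N1 = N2 = 1000 to 100 to keep the example fast.

```python
import numpy as np

rng = np.random.default_rng(0)

# Training data as described in the paper: inputs are the rows of the
# 8-dimensional identity matrix. The paper's fixed 15-dimensional output
# matrix is not reproduced here, so a random placeholder stands in for it.
X = np.eye(8)                       # P = 8 examples, 8-dim inputs
Y = rng.standard_normal((8, 15))    # placeholder for the paper's output matrix

# Two parallel linear pathways, each with one hidden layer; the network
# output is the sum of the pathway outputs. Hidden width is reduced from
# the paper's N1 = N2 = 1000 to keep the sketch fast.
N, sigma, lr, epochs = 100, 0.01, 0.01, 1000
W1_in = rng.normal(0, sigma, (8, N));  W1_out = rng.normal(0, sigma, (N, 15))
W2_in = rng.normal(0, sigma, (8, N));  W2_out = rng.normal(0, sigma, (N, 15))

def loss():
    pred = X @ W1_in @ W1_out + X @ W2_in @ W2_out
    return 0.5 * np.sum((pred - Y) ** 2)

init_loss = loss()
for _ in range(epochs):
    err = X @ W1_in @ W1_out + X @ W2_in @ W2_out - Y  # d(loss)/d(pred)
    # Full-batch gradient descent on the squared loss, per pathway.
    g1_out = (X @ W1_in).T @ err
    g1_in = X.T @ err @ W1_out.T
    g2_out = (X @ W2_in).T @ err
    g2_in = X.T @ err @ W2_out.T
    W1_in -= lr * g1_in;  W1_out -= lr * g1_out
    W2_in -= lr * g2_in;  W2_out -= lr * g2_out

final_loss = loss()
```

With the small initialization the loss stays on a plateau early in training before the pathway weights grow, which is the regime whose dynamics the paper analyzes.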