Saddle-to-Saddle Dynamics in Diagonal Linear Networks
Authors: Scott Pesme, Nicolas Flammarion
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide numerical experiments to support our findings. |
| Researcher Affiliation | Academia | Scott Pesme, EPFL, scott.pesme@epfl.ch; Nicolas Flammarion, EPFL, nicolas.flammarion@epfl.ch |
| Pseudocode | Yes | Algorithm 1: Successive saddles and jump times of lim_{α→0} β^α |
| Open Source Code | No | The paper does not provide any explicit statements about releasing open-source code or links to a code repository. |
| Open Datasets | No | For each experiment we generate our dataset as y_i = ⟨x_i, β⋆⟩ where x_i ∼ N(0, H) for a diagonal covariance matrix H and β⋆ is a vector in R^d. The only assumption we make on the data throughout the paper is that the inputs (x_1, …, x_n) are in general position. |
| Dataset Splits | No | The paper describes how the data is generated for each experiment, but does not specify training, validation, or test splits. It directly uses the generated data for the numerical experiments. |
| Hardware Specification | No | The paper describes the experimental setup, but does not specify any hardware details (e.g., CPU, GPU, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions that "Gradient descent is run with a small step size" but does not provide specific software names with version numbers for reproducibility. |
| Experiment Setup | Yes | For each experiment we generate our dataset as y_i = ⟨x_i, β⋆⟩ where x_i ∼ N(0, H) for a diagonal covariance matrix H and β⋆ is a vector in R^d. Gradient descent is run with a small step size and from initialisation u_0 = √2 α 1_d and v_0 = 0_d for some initialisation scale α > 0. Figure 1 and Figure 4 (Left): (n, d, α) = (5, 7, 10⁻¹²⁰), H = I_d, β⋆ = (10, 20, 0, 0, 0, 0, 0) ∈ R⁷. (See the code sketch after this table.) |
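
The setup row above is concrete enough to sketch in code. Since no code is released, the following is a minimal reproduction sketch, not the authors' implementation: it assumes the standard diagonal-linear-network parameterisation β = u ⊙ v with loss L(u, v) = ‖X(u ⊙ v) − y‖² / (4n), reads √2 α as √2·α, and picks its own step size, iteration count, and a larger α (the paper's α = 10⁻¹²⁰ makes the plateaus between saddles extremely long in plain float64).

```python
import numpy as np

# Minimal sketch of the paper's experimental setup. The parameterisation
# beta = u * v and the loss 1/(4n) * ||X(u*v) - y||^2 are the standard
# diagonal-linear-network choices, assumed rather than copied from any
# released code; lr, alpha, and the step count are also assumptions.

rng = np.random.default_rng(0)

n, d = 5, 7
alpha = 1e-6            # paper uses alpha = 1e-120; smaller alpha = longer plateaus
H = np.ones(d)          # diagonal of the covariance; here H = I_d as in Figure 1
beta_star = np.array([10.0, 20.0, 0.0, 0.0, 0.0, 0.0, 0.0])

X = rng.normal(size=(n, d)) * np.sqrt(H)   # rows x_i ~ N(0, H)
y = X @ beta_star                          # y_i = <x_i, beta_star>

u = np.sqrt(2.0) * alpha * np.ones(d)      # u_0 = sqrt(2) * alpha * 1_d (assumed reading)
v = np.zeros(d)                            # v_0 = 0_d

lr = 1e-3                                  # "small step size" (assumed value)
for step in range(300_000):
    r = X @ (u * v) - y                    # residuals
    g = X.T @ r / (2 * n)                  # shared gradient factor: grad_u = g*v, grad_v = g*u
    u, v = u - lr * g * v, v - lr * g * u  # gradient descent on (u, v)
    if step % 30_000 == 0:
        print(step, np.round(u * v, 3))    # watch coordinates of beta activate one by one
```

With a small α the iterate β = u ⊙ v plateaus near successive saddles before individual coordinates activate, which is the saddle-to-saddle behaviour the paper analyses; the printout above makes those jumps visible.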