Only Strict Saddles in the Energy Landscape of Predictive Coding Networks?
Authors: Francesco Innocenti, El Mehdi Achour, Ryan Singh, Christopher L. Buckley
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on both linear and non-linear networks strongly validate our theory and further suggest that all the saddles of the equilibrated energy are strict. |
| Researcher Affiliation | Academia | Francesco Innocenti, School of Engineering and Informatics, University of Sussex (F.Innocenti@sussex.ac.uk); El Mehdi Achour, RWTH Aachen University, Aachen, Germany (achour@mathc.rwth-aachen.de); Ryan Singh, School of Engineering and Informatics, University of Sussex (rs773@sussex.ac.uk); Christopher L. Buckley, School of Engineering and Informatics, University of Sussex & VERSES (c.l.buckley@sussex.ac.uk) |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code to reproduce all the experiments is available at https://github.com/francesco-innocenti/pc-saddles. |
| Open Datasets | Yes | We trained DLNs with different numbers of hidden layers H ∈ {2, 5, 10} on standard image classification datasets (MNIST, Fashion-MNIST and CIFAR10). |
| Dataset Splits | No | The paper mentions training networks and observing training loss dynamics but does not explicitly provide information on train/validation/test splits, proportions, or specific methods for data partitioning. |
| Hardware Specification | No | The paper's NeurIPS checklist states: "Most experimental results can be reproduced in a few hours on a CPU, with the exception of those related to Figures 5 & 12 which were run on a GPU (typically A100)." This is not a specific hardware specification for all experiments. |
| Software Dependencies | No | The paper mentions using "standard Euler integration" and a "second-order explicit Runge-Kutta ODE solver (Heun)" but does not list specific software libraries or frameworks with version numbers (e.g., PyTorch 1.9, TensorFlow 2.x). |
| Experiment Setup | Yes | The following hyperparameters were used for all networks: 300 hidden units and SGD with learning rate η = 1e-3 and batch size b = 64. We used a second-order explicit Runge-Kutta ODE solver (Heun) with a maximum upper integration limit T = 300 and an adaptive Proportional-Integral-Derivative controller (absolute and relative tolerances: 1e-3) to ensure convergence of the PC inference dynamics (Eq. 3). All networks were initialised close to the origin, W_ij ~ N(0, σ²) with σ = 5e-3. |
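
To make the reported setup concrete, below is a minimal sketch of how these solver settings might be wired together. It is not the authors' code: it assumes a JAX/Diffrax implementation (the paper does not name its software stack), and the network sizes, data, and linear energy function are illustrative placeholders. Only the Heun solver, the integration limit T = 300, the PID tolerances of 1e-3, and the weight-initialisation scale σ = 5e-3 are taken from the paper; inference is modelled as the standard PC gradient flow dz/dt = -∂E/∂z on the hidden activities.

```python
# Illustrative sketch, NOT the paper's implementation. Assumes JAX + Diffrax;
# network sizes, data, and the linear energy below are placeholder choices.
import jax
import jax.numpy as jnp
import diffrax

sigma = 5e-3                      # weight-init scale from the paper
sizes = [10, 300, 300, 10]        # toy dims; paper uses 300 hidden units
keys = jax.random.split(jax.random.PRNGKey(0), len(sizes) - 1)
weights = [sigma * jax.random.normal(k, (m, n))
           for k, m, n in zip(keys, sizes[1:], sizes[:-1])]

def energy(zs, x, y, Ws):
    # PC energy for a deep linear network: sum of squared prediction errors.
    acts = [x, *zs, y]
    return 0.5 * sum(jnp.sum((acts[l + 1] - W @ acts[l]) ** 2)
                     for l, W in enumerate(Ws))

def inference_field(t, zs, args):
    # Inference as gradient flow on the energy w.r.t. hidden activities z.
    x, y, Ws = args
    return jax.tree.map(lambda g: -g, jax.grad(energy)(zs, x, y, Ws))

x = jax.random.normal(jax.random.PRNGKey(1), (sizes[0],))    # dummy input
y = jax.random.normal(jax.random.PRNGKey(2), (sizes[-1],))   # dummy target
z0 = tuple(jnp.zeros(n) for n in sizes[1:-1])                # hidden states

sol = diffrax.diffeqsolve(
    diffrax.ODETerm(inference_field),
    diffrax.Heun(),                    # second-order explicit Runge-Kutta
    t0=0.0, t1=300.0, dt0=0.1,         # upper integration limit T = 300
    y0=z0, args=(x, y, weights),
    stepsize_controller=diffrax.PIDController(rtol=1e-3, atol=1e-3),
)
z_star = jax.tree.map(lambda a: a[-1], sol.ys)  # equilibrated activities
```

Per the setup row above, training would then update the weights with SGD (η = 1e-3, batch size b = 64) on the energy evaluated at these equilibrated activities.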