Neural optimal feedback control with local learning rules

Authors: Johannes Friedrich, Siavash Golkar, Shiva Farashahi, Alexander Genkin, Anirvan Sengupta, Dmitri Chklovskii

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In this section we look at three different experiments to demonstrate various features of the Bio-OFC algorithm. In Sec. 4.1, we look at a discrete-time double integrator and discuss how our approach performs not only Kalman filtering (learning the optimal Kalman gain), but also full system-ID in the open-loop setting (system-ID followed by control) as well as in the more challenging closed-loop setting (simultaneous system-ID and control). In each case, we provide quantitative comparisons and discuss the effect of increased delay. In Secs. 4.2 and 4.3, we apply our methodology to two biologically relevant control tasks, that of reaching movements and flight."
Researcher Affiliation | Collaboration | Johannes Friedrich (1), Siavash Golkar (1), Shiva Farashahi (1), Alexander Genkin (5), Anirvan M. Sengupta (2,3,4), Dmitri B. Chklovskii (1,5). Affiliations: (1) Center for Computational Neuroscience, Flatiron Institute; (2) Center for Computational Mathematics, Flatiron Institute; (3) Center for Computational Quantum Physics, Flatiron Institute; (4) Department of Physics and Astronomy, Rutgers University; (5) Neuroscience Institute, NYU Medical Center.
Pseudocode | No | The paper describes its algorithms step by step (e.g., the online algorithm combining Kalman filtering with policy gradient, and the stochastic gradients used for the updates) but does not provide a formally labeled "Pseudocode" or "Algorithm" block.
Open Source Code | Yes | "Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] See Supplementary Material."
Open Datasets | Yes | "To test the performance of our network, we simulate Bio-OFC in episodic (finite horizon) tasks (e.g., a discrete-time double integrator model, a hand reaching task [1], and a simplified fly simulation)." and "For our final example, we designed an Open AI gym [32] environment which simulates winged flight in 2-d with simplified dynamics (cf. Fig. S15)."
Dataset Splits | No | The paper describes training over a number of episodes (e.g., "5000 episodes", "10000 episodes") and reports mean and SEM over multiple runs, but it does not specify explicit training/validation/test dataset splits (e.g., percentages or counts) or refer to standard predefined splits.
Hardware Specification | No | The main text does not provide specific hardware details (e.g., GPU/CPU models or memory) used for the experiments. The paper's checklist indicates such information is in the supplementary material, which is not provided.
Software Dependencies | No | The paper mentions software such as Optuna [29] and Open AI gym [32] but does not provide version numbers for these or other software components.
Experiment Setup | Yes | "Learning rates that quickly yield good final performance were chosen by minimizing the sum of average reward during and after learning using Optuna [29], a hyperparameter optimization framework freely available under the MIT license. Different noise levels σ and momenta m are considered in Supplementary Figs. S6 and S7 respectively." and "We initialized A and B with small random numbers drawn from N(0, 0.01)." and "We initialized the weights of our network to the values that are optimal in a null force field, using a unit time of 10 ms and a sensory delay of 50 ms (i.e. τ = 5), as has been measured experimentally [30]."
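The double-integrator task mentioned above is a standard linear control benchmark: a state of position and velocity driven by an acceleration command. The following is a minimal sketch of such a discrete-time double integrator with optional process noise; the time step `dt`, noise scale `sigma`, and the simple PD feedback policy are textbook choices assumed here for illustration, not the paper's Bio-OFC controller or its learned Kalman gain.

```python
import random

def double_integrator_step(x, v, u, dt=0.1, sigma=0.0):
    """One step of a discrete-time double integrator: position x and
    velocity v updated under control input u (an acceleration),
    with optional Gaussian process noise of scale sigma."""
    x_new = x + dt * v + 0.5 * dt**2 * u + random.gauss(0.0, sigma)
    v_new = v + dt * u + random.gauss(0.0, sigma)
    return x_new, v_new

def rollout(x0, v0, policy, steps, dt=0.1, sigma=0.0):
    """Simulate one episode under a state-feedback policy u = policy(x, v)."""
    x, v = x0, v0
    for _ in range(steps):
        u = policy(x, v)
        x, v = double_integrator_step(x, v, u, dt=dt, sigma=sigma)
    return x, v

# Example: simple proportional-derivative feedback toward the origin
# (hypothetical gains; the paper learns its policy instead).
pd_policy = lambda x, v: -1.0 * x - 1.5 * v
final_x, final_v = rollout(x0=1.0, v0=0.0, policy=pd_policy, steps=200)
```

With `sigma=0.0` the closed-loop system is stable and the state decays toward the origin; setting `sigma > 0` reproduces the noisy setting in which state estimation (e.g., Kalman filtering) becomes necessary.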
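The flight experiment is described as a custom OpenAI Gym environment. A Gym-style environment exposes a `reset()`/`step(action)` interface; the skeleton below illustrates that interface with a toy 1-D thrust-vs-gravity placeholder. The class name, dynamics, and reward are hypothetical stand-ins, not the paper's winged-flight simulator, and the `gym` dependency is deliberately omitted.

```python
class Winged2DEnvSketch:
    """Hypothetical flight-like environment with a Gym-style
    reset()/step() interface. The dynamics are a toy placeholder."""

    def __init__(self, dt=0.05, gravity=9.8, horizon=200):
        self.dt, self.g, self.horizon = dt, gravity, horizon
        self.state = None
        self.t = 0

    def reset(self):
        # State: (height, vertical velocity).
        self.state = (1.0, 0.0)
        self.t = 0
        return self.state

    def step(self, action):
        """action: upward thrust. Returns (obs, reward, done, info),
        matching the classic Gym step signature."""
        h, v = self.state
        v = v + self.dt * (action - self.g)
        h = h + self.dt * v
        self.t += 1
        reward = -abs(h - 1.0)  # stay near the target height of 1.0
        done = self.t >= self.horizon or h <= 0.0
        self.state = (h, v)
        return self.state, reward, done, {}

env = Winged2DEnvSketch()
obs = env.reset()
obs, r, done, _ = env.step(9.8)  # thrust exactly cancels gravity
```

Wrapping a simulator in this interface is what lets standard episodic learners, such as the policy-gradient component of Bio-OFC, interact with it through a uniform loop of `reset` and `step` calls.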
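The Experiment Setup row quotes that A and B were initialized "with small random numbers drawn from N(0, 0.01)". A minimal sketch of that initialization is below; note that whether 0.01 denotes the variance or the standard deviation is an assumption here (the code takes it as the standard deviation), and the matrix shapes are illustrative.

```python
import random

def init_matrix(rows, cols, std=0.01, seed=None):
    """Build a rows x cols matrix of small Gaussian entries,
    mirroring the N(0, 0.01) initialization quoted above.
    Interpreting 0.01 as the standard deviation is an assumption."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, std) for _ in range(cols)] for _ in range(rows)]

# Illustrative shapes for a double integrator (2-d state, 1-d control).
A = init_matrix(2, 2, seed=0)
B = init_matrix(2, 1, seed=1)
```

Seeding via `random.Random(seed)` keeps the initialization reproducible across runs, which matters when reporting means and SEMs over repeated episodes as the paper does.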