NAS-X: Neural Adaptive Smoothing via Twisting

Authors: Dieterich Lawson, Michael Li, Scott Linderman

NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We illustrate the theoretical advantages of NAS-X over previous methods and explore these advantages empirically in a variety of tasks, including a challenging application to mechanistic models of neuronal dynamics. These experiments show that NAS-X substantially outperforms previous VI- and RWS-based methods in inference and model learning, achieving lower parameter error and tighter likelihood bounds.
Researcher Affiliation | Collaboration | Dieterich Lawson*, Google Research, dieterichl@google.com; Michael Y. Li*, Stanford University, michaelyli@stanford.edu; Scott W. Linderman, Stanford University, scott.linderman@stanford.edu
Pseudocode | Yes | A full description is available in Algorithms 1 and 2. Algorithm 1: NAS-X, Procedure NAS-X(θ0, ϕ0, ψ0, y1:T). Algorithm 2: Twist Training. (A hedged toy sketch of the twisted-SMC step underlying these algorithms appears below the table.)
Open Source Code | No | The paper does not contain any explicit statement about releasing source code for the described methodology, nor does it provide a link to a code repository.
Open Datasets | Yes | The dataset used to fit the model was a subset of the stimulus/response pairs available from the Allen Institute. For cell 480169178, the criteria above selected 95 stimulus/response pairs... Each trace pair was then downsampled to 1 ms... and corrupted with mean-zero Gaussian noise of variance 20 mV²... Finally, the 95 traces were randomly split into 72 training traces and 23 test traces. (A preprocessing-and-split sketch appears below the table.) [48] Quanxin Wang et al. The Allen mouse brain common coordinate framework: a 3D reference atlas. Cell, 181(4):936–953, 2020. [52] AIBS. Biophysical modeling - perisomatic. Technical report, Allen Institute for Brain Science, October 2017. URL http://help.brain-map.org/display/celltypes/Documentation.
Dataset Splits | No | The paper states: 'Finally, the 95 traces were randomly split into 72 training traces and 23 test traces.' It does not explicitly mention a separate validation split or its size. While it mentions 'early stopping on the train log marginal likelihood lower bound,' this implies using the training data itself, or a subset thereof, for evaluation rather than a distinct validation split.
Hardware Specification | No | The paper mentions running experiments 'on a single CPU core with 7 Gb of memory' but does not specify the CPU model or type. No other specific hardware (e.g., GPU models, TPU types) is mentioned.
Software Dependencies | No | The paper mentions software such as 'Adam' [50], 'JAX' [21], and 'TensorFlow' [22] but does not provide specific version numbers for any of these dependencies, which are required for reproducibility. (A minimal version-logging snippet appears below the table.)
Experiment Setup | Yes | We use a batch size of 32 for the density ratio estimation step. For the twist, we used Adam with a learning rate schedule that starts at a constant 1e-3 and decays the learning rate by factors of 0.3 and 0.33 at 100,000 and 300,000 iterations. For the proposal, we used Adam with a constant learning rate of 1e-3. We ran a hyperparameter sweep over learning rates and initial values of the voltage and observation noise variances (270 hyperparameter settings in all), and selected the best-performing model via early stopping on the train log marginal likelihood lower bound. (An optax sketch of this schedule appears below the table.)
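
For the pseudocode row: the paper's Algorithms 1 and 2 are not reproduced here. As a rough orientation only, the following is a minimal toy sketch of one ingredient they rely on, a bootstrap particle filter whose weights are multiplied by a "twist" ratio approximating the likelihood of future observations, which is what turns filtering particles into smoothing particles. Everything below is a hypothetical stand-in: the 1-D linear-Gaussian model, the fixed quadratic twist, and all names are illustrative, and the RWS-style model/proposal updates and the density-ratio twist training of Algorithm 2 are omitted. This is not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy model: a 1-D linear-Gaussian state space model standing in
# for the mechanistic neuron models used in the paper.
A, Q, C, R = 0.9, 0.5, 1.0, 1.0

def log_gauss(x, mean, var):
    return -0.5 * (np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)

def logsumexp(v):
    m = np.max(v)
    return m + np.log(np.sum(np.exp(v - m)))

def simulate(T):
    xs = np.zeros(T)
    xs[0] = rng.normal(0.0, 1.0)
    for t in range(1, T):
        xs[t] = A * xs[t - 1] + rng.normal(0.0, np.sqrt(Q))
    ys = C * xs + rng.normal(0.0, np.sqrt(R), size=T)
    return xs, ys

def toy_twist(x, t, T):
    """Placeholder twist log r_t(x). In NAS-X the twist is a neural network
    trained (Algorithm 2) to approximate the likelihood of future observations
    given x_t; here it is a fixed quadratic bump. The final twist must be
    identically one (log value zero) so the weights telescope correctly."""
    return np.zeros_like(x) if t == T - 1 else -0.05 * x ** 2

def twisted_bootstrap_smc(ys, n_particles=256, twist=toy_twist):
    """Bootstrap particle filter with multiplicatively twisted weights.
    Returns an estimate of the log marginal likelihood of ys."""
    T = len(ys)
    particles = rng.normal(0.0, 1.0, size=n_particles)  # draw from the prior
    log_prev_twist = np.zeros(n_particles)
    log_z_hat = 0.0
    for t in range(T):
        if t > 0:  # propagate through the transition model
            particles = A * particles + rng.normal(0.0, np.sqrt(Q), size=n_particles)
        log_twist = twist(particles, t, T)
        # Incremental weight: likelihood times the twist ratio r_t(x_t) / r_{t-1}(x_{t-1}).
        log_w = log_gauss(ys[t], C * particles, R) + log_twist - log_prev_twist
        log_z_hat += logsumexp(log_w) - np.log(n_particles)
        # Multinomial resampling proportional to the twisted weights.
        probs = np.exp(log_w - logsumexp(log_w))
        probs = probs / probs.sum()
        idx = rng.choice(n_particles, size=n_particles, p=probs)
        particles, log_prev_twist = particles[idx], log_twist[idx]
    return log_z_hat

_, ys = simulate(T=50)
print("estimated log marginal likelihood:", twisted_bootstrap_smc(ys))
```

In the full method, the weighted particles produced by a step like this would feed RWS-style updates of the model and proposal, and the twist would then be refit; consult the paper's Algorithms 1 and 2 for the actual procedure.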
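For the dataset rows: the reported preprocessing (downsampling each trace pair to 1 ms resolution, corrupting responses with mean-zero Gaussian noise of variance 20 mV², and randomly splitting 95 traces into 72 train / 23 test) maps onto a short script. The sketch below only illustrates those stated numbers; the array contents, the raw sampling rate, and the decimation-based downsampling are assumptions, not the authors' code.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Assumed raw resolution: a hypothetical 0.05 ms grid (20 kHz); the paper only
# states the 1 ms target resolution.
RAW_DT_MS, TARGET_DT_MS = 0.05, 1.0
NOISE_VARIANCE_MV2 = 20.0      # mean-zero Gaussian noise, variance 20 mV^2
N_TRACES, N_TRAIN = 95, 72     # 95 traces split into 72 train / 23 test

def downsample(trace, raw_dt=RAW_DT_MS, target_dt=TARGET_DT_MS):
    """Downsample by simple decimation to the target grid (method assumed)."""
    step = int(round(target_dt / raw_dt))
    return trace[::step]

def corrupt(response_mv):
    """Add mean-zero Gaussian observation noise with variance 20 mV^2."""
    return response_mv + rng.normal(0.0, np.sqrt(NOISE_VARIANCE_MV2), response_mv.shape)

# Hypothetical stand-in for the 95 stimulus/response pairs for cell 480169178.
raw_pairs = [(rng.normal(size=40_000), rng.normal(size=40_000)) for _ in range(N_TRACES)]

processed = [(downsample(stim), corrupt(downsample(resp))) for stim, resp in raw_pairs]

# Random 72/23 split; note the paper reports no separate validation split.
perm = rng.permutation(N_TRACES)
train = [processed[i] for i in perm[:N_TRAIN]]
test = [processed[i] for i in perm[N_TRAIN:]]
print(len(train), "train traces,", len(test), "test traces")
```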
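For the software-dependencies row: the paper names JAX, TensorFlow, and the Adam optimizer but gives no versions. One minimal way to close that gap when re-running the experiments is to log the installed versions alongside the results; the snippet assumes the libraries are importable and is not drawn from the paper.

```python
import json
import platform

import jax
import tensorflow as tf

# Record the interpreter and library versions actually used for a run.
versions = {
    "python": platform.python_version(),
    "jax": jax.__version__,
    "tensorflow": tf.__version__,
}
print(json.dumps(versions, indent=2))
```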
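For the experiment-setup row: the quoted twist schedule (constant 1e-3, decayed by 0.3 at 100,000 iterations and by 0.33 at 300,000) and the constant 1e-3 proposal rate can be written down with optax, a common optimizer library for JAX. This is a sketch under two assumptions not confirmed by the paper: that optax was the library used, and that the two decays are cumulative multiplicative scalings at those step counts.

```python
import optax

# Twist optimizer: Adam with a piecewise-constant schedule that starts at 1e-3
# and scales the learning rate by 0.3 after step 100,000 and by a further 0.33
# after step 300,000 (interpreting "decays by 0.3 and 0.33" multiplicatively).
twist_schedule = optax.piecewise_constant_schedule(
    init_value=1e-3,
    boundaries_and_scales={100_000: 0.3, 300_000: 0.33},
)
twist_optimizer = optax.adam(learning_rate=twist_schedule)

# Proposal optimizer: Adam with a constant learning rate of 1e-3.
proposal_optimizer = optax.adam(learning_rate=1e-3)

# Example: the twist learning rate at a few step counts.
for step in (0, 150_000, 350_000):
    print(step, float(twist_schedule(step)))
```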