NAS-X: Neural Adaptive Smoothing via Twisting
Authors: Dieterich Lawson, Michael Li, Scott Linderman
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We illustrate the theoretical advantages of NAS-X over previous methods and explore these advantages empirically in a variety of tasks, including a challenging application to mechanistic models of neuronal dynamics. These experiments show that NAS-X substantially outperforms previous VI- and RWS-based methods in inference and model learning, achieving lower parameter error and tighter likelihood bounds. |
| Researcher Affiliation | Collaboration | Dieterich Lawson*, Google Research (dieterichl@google.com); Michael Y. Li*, Stanford University (michaelyli@stanford.edu); Scott W. Linderman, Stanford University (scott.linderman@stanford.edu) |
| Pseudocode | Yes | A full description is available in Algorithms 1 and 2. Algorithm 1: NAS-X, Procedure NAS-X(θ₀, ϕ₀, ψ₀, y_{1:T}). Algorithm 2: Twist Training. (For orientation, a toy twisted-SMC sketch follows the table.) |
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code for the described methodology, nor does it provide a link to a code repository. |
| Open Datasets | Yes | The dataset used to fit the model was a subset of the stimulus/response pairs available from the Allen Institute. For cell 480169178, the criteria above selected 95 stimulus/response pairs... Each trace pair was then downsampled to 1 ms... and corrupted with mean-zero Gaussian noise of variance 20 mV²... Finally, the 95 traces were randomly split into 72 training traces and 23 test traces. [48] Quanxin Wang et al. The Allen mouse brain common coordinate framework: A 3D reference atlas. Cell, 181(4):936–953, 2020. [52] AIBS. Biophysical modeling: perisomatic. Technical report, Allen Institute for Brain Science, October 2017. URL: http://help.brain-map.org/display/celltypes/Documentation. (A preprocessing sketch follows the table.) |
| Dataset Splits | No | The paper states: 'Finally, the 95 traces were randomly split into 72 training traces and 23 test traces.' It does not explicitly mention a separate validation set split or its size. While it mentions 'early stopping on the train log marginal likelihood lower bound,' this implies using the training data itself or a subset thereof for evaluation, not a distinct validation split. |
| Hardware Specification | No | The paper mentions running experiments 'on a single CPU core with 7 Gb of memory' but does not specify the model or type of the CPU. No other specific hardware (e.g., GPU models, TPU types) is mentioned. |
| Software Dependencies | No | The paper mentions software like 'Adam' [50], 'JAX' [21], and 'TensorFlow' [22] but does not provide specific version numbers for any of these dependencies, which is required for reproducibility. |
| Experiment Setup | Yes | We use a batch size of 32 for the density ratio estimation step. For the twist, we used Adam with a learning rate schedule that starts at a constant learning rate of 1e-3 and decays the learning rate by factors of 0.3 and 0.33 at 100,000 and 300,000 iterations, respectively. For the proposal, we used Adam with a constant learning rate of 1e-3. We ran a hyperparameter sweep over learning rates and initial values of the voltage and observation noise variances (270 hyperparameter settings in all), and selected the best-performing model via early stopping on the train log marginal likelihood lower bound. (An optax sketch of this schedule follows the table.) |
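The pseudocode for Algorithms 1 and 2 is not reproduced in this summary, so as orientation for what the twist functions do, the sketch below runs a twisted bootstrap particle filter on a toy 1-D linear-Gaussian state-space model. The model, its parameters, the exact one-step-lookahead twist (standing in for the neural twists that Algorithm 2 trains via density ratio estimation), and every function name are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def logsumexp(v):
    m = np.max(v)
    return m + np.log(np.sum(np.exp(v - m)))

def twisted_bootstrap_smc(y, a=0.9, q=0.1, r=1.0, num_particles=256, seed=0):
    """Toy twisted SMC on x_1 ~ N(0, q), x_t = a*x_{t-1} + N(0, q), y_t = x_t + N(0, r).

    The twist psi_t(x) multiplies the filtering weights so particles are steered
    toward states that also explain the next observation; here psi_t is the exact
    one-step lookahead p(y_{t+1} | x_t), whereas NAS-X learns neural twists.
    Returns a stochastic estimate of log p(y_{1:T}) (a lower bound in expectation).
    """
    rng = np.random.default_rng(seed)
    T, N = len(y), num_particles

    def log_obs(t, x):
        # log N(y_t; x_t, r)
        return -0.5 * (np.log(2 * np.pi * r) + (y[t] - x) ** 2 / r)

    def log_twist(t, x):
        # log psi_t(x) = log N(y_{t+1}; a*x, q + r); the final twist is constant 1.
        if t >= T - 1:
            return np.zeros_like(x)
        var = q + r
        return -0.5 * (np.log(2 * np.pi * var) + (y[t + 1] - a * x) ** 2 / var)

    log_Z = 0.0
    for t in range(T):
        if t == 0:
            x = rng.normal(0.0, np.sqrt(q), size=N)              # sample x_1 from the prior
            log_w = log_obs(0, x) + log_twist(0, x)
        else:
            idx = rng.choice(N, size=N, p=w)                      # multinomial resampling
            x_prev = x[idx]
            x = a * x_prev + rng.normal(0.0, np.sqrt(q), size=N)  # bootstrap proposal
            # incremental weight: observation term times twist ratio psi_t(x_t) / psi_{t-1}(x_{t-1})
            log_w = log_obs(t, x) + log_twist(t, x) - log_twist(t - 1, x_prev)
        log_Z += logsumexp(log_w) - np.log(N)
        w = np.exp(log_w - logsumexp(log_w))
        w /= w.sum()                                              # guard against rounding
    return log_Z

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x_true = np.cumsum(rng.normal(scale=0.3, size=50))
    y_obs = x_true + rng.normal(size=50)
    print("estimated log p(y_{1:T}):", twisted_bootstrap_smc(y_obs))
```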
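The preprocessing and split quoted in the Open Datasets row (downsampling to 1 ms, mean-zero Gaussian noise of variance 20 mV², and a random 72/23 split of the 95 traces) could be reproduced along the lines of the following sketch. The decimation-style downsampling, the function names, and the fixed seeds are assumptions; the excerpt does not state the original sampling rate or the exact downsampling and shuffling procedure.

```python
import numpy as np

def preprocess_traces(traces, downsample_factor, noise_var=20.0, seed=0):
    """Downsample each voltage trace to ~1 ms bins and add mean-zero Gaussian noise.

    `traces` is assumed to have shape (num_traces, num_samples); `noise_var` is in
    mV^2; `downsample_factor` depends on the original sampling rate, which the
    excerpt does not state.
    """
    rng = np.random.default_rng(seed)
    down = traces[:, ::downsample_factor]  # naive decimation; bin-averaging is another option
    return down + rng.normal(0.0, np.sqrt(noise_var), size=down.shape)

def split_traces(traces, n_train=72, seed=0):
    """Randomly split the 95 stimulus/response traces into 72 train / 23 test."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(len(traces))
    return traces[perm[:n_train]], traces[perm[n_train:]]
```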
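The optimizer settings quoted in the Experiment Setup row map naturally onto an optax schedule as below. Whether the authors used optax, and whether the quoted 0.3 and 0.33 are multiplicative decay factors applied at the stated step counts, are assumptions here.

```python
import optax

# Twist optimizer: Adam starting at 1e-3, with the learning rate scaled by 0.3 at
# step 100,000 and by a further 0.33 at step 300,000 (reading "decays the learning
# rate by 0.3 and 0.33" as multiplicative factors).
twist_schedule = optax.piecewise_constant_schedule(
    init_value=1e-3,
    boundaries_and_scales={100_000: 0.3, 300_000: 0.33},
)
twist_optimizer = optax.adam(learning_rate=twist_schedule)

# Proposal optimizer: Adam with a constant learning rate of 1e-3.
proposal_optimizer = optax.adam(learning_rate=1e-3)
```

The batch size of 32 quoted above applies to the density ratio estimation step and would be set in the data pipeline rather than in the optimizer.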