Biological learning in key-value memory networks
Authors: Danil Tyulmankov, Ching Fang, Annapurna Vadaparty, Guangyu Robert Yang
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We compute the accuracy as a function of the number of stored stimuli (Figure 2a). Figure 3a shows the performance of the meta-learned algorithms in comparison to the corresponding simplified versions (section 2) for either sequential or random local third factors. |
| Researcher Affiliation | Academia | Danil Tyulmankov (Columbia University, dt2586@columbia.edu); Ching Fang (Columbia University, ching.fang@columbia.edu); Annapurna Vadaparty (Columbia University / Stanford University, apvadaparty@gmail.com); Guangyu Robert Yang (Columbia University / Massachusetts Institute of Technology, yanggr@mit.edu) |
| Pseudocode | No | The paper describes the model equations and update rules (e.g., Eq. 1-17) in mathematical notation and prose, but does not include a distinct pseudocode or algorithm block. |
| Open Source Code | No | The paper does not provide any statement regarding the release of open-source code for the described methodology, nor does it include links to a code repository. |
| Open Datasets | No | The paper states that stimuli are generated, e.g., 'All experiments use random uncorrelated binary random memories x_t ∈ {+1, −1}^d unless otherwise stated.' and 'For the correlated memories (Fig 4c), we generate stimuli by picking a template pattern at random, and for each subsequent pattern we flip each bit with some probability.' It does not refer to a publicly available dataset with specific access information or citations. |
| Dataset Splits | No | The paper describes properties of the input stimuli, such as 'The network stores a set of T stimuli {x_t} and the query x̃ is a corrupted version of a stored key (60% of the entries are randomly set to zero)', and mentions 'Training data consists of sequence lengths between T = N/2 and T = 2N'. However, it does not specify explicit training, validation, or test dataset splits in terms of percentages or counts, or refer to standard predefined splits. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory, or cloud resources) used to run the experiments. |
| Software Dependencies | No | The paper mentions using 'Adam' for optimization but does not provide specific version numbers for any software dependencies, programming languages, or libraries used for implementation. |
| Experiment Setup | Yes | optimizing these parameters using stochastic gradient descent (Adam, [Kingma and Ba, 2014]). Empirically, we find that p ≈ 4/N produces desirable performance. a stronger global third factor (q_t = 10). we introduce an empirically chosen synaptic decay parameter λ = 0.95 to the Hopfield network. (d = N = 40, from Figure 2 caption). |
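
The 'Open Datasets' and 'Dataset Splits' rows quote the paper's stimulus-generation and query-corruption procedure. The sketch below shows one way those descriptions could be implemented in numpy; the function names, the `flip_prob` default, and the random seed are illustrative assumptions rather than details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_memories(T, d):
    """Uncorrelated binary memories x_t in {+1, -1}^d (as quoted above)."""
    return rng.choice([-1.0, 1.0], size=(T, d))

def correlated_memories(T, d, flip_prob=0.1):
    """Pick a template pattern at random, then flip each bit with some
    probability for every subsequent pattern (flip_prob is illustrative)."""
    template = rng.choice([-1.0, 1.0], size=d)
    flips = rng.random((T, d)) < flip_prob
    return np.where(flips, -template, template)

def corrupt_query(x, zero_frac=0.6):
    """Corrupted query: 60% of the entries of a stored key are set to zero."""
    x = x.copy()
    idx = rng.choice(x.size, size=int(round(zero_frac * x.size)), replace=False)
    x[idx] = 0.0
    return x

# Example usage with the quoted setting (d = N = 40, sequence lengths
# between T = N/2 and T = 2N).
N = d = 40
T = int(rng.integers(N // 2, 2 * N + 1))
X = random_memories(T, d)
query = corrupt_query(X[0])
```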
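
The 'Experiment Setup' row mentions adding an empirically chosen synaptic decay λ = 0.95 to the Hopfield baseline. Below is a minimal sketch of one common way such a decay can enter the classical Hebbian outer-product storage rule; the paper's actual update equations (Eqs. 1-17) are not reproduced here, so this is an assumption-laden illustration, not the authors' implementation.

```python
import numpy as np

def hopfield_store(memories, lam=0.95):
    """Hebbian outer-product storage with synaptic decay lam applied before
    each new pattern is written; a generic construction, not necessarily the
    paper's exact rule."""
    T, d = memories.shape
    W = np.zeros((d, d))
    for x in memories:
        W = lam * W + np.outer(x, x) / d
    np.fill_diagonal(W, 0.0)  # no self-connections
    return W

def hopfield_retrieve(W, query, n_steps=10):
    """Iterative sign-threshold recall starting from a corrupted query."""
    s = np.where(query == 0, 1.0, np.sign(query))  # break ties on zeroed entries
    for _ in range(n_steps):
        s = np.sign(W @ s)
        s[s == 0] = 1.0
    return s
```

With λ < 1, older memories are exponentially down-weighted, which trades retention of early items for cleaner retrieval of recent ones; this is the usual motivation for adding a decay term to the Hopfield baseline.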