Dense Associative Memory Through the Lens of Random Features
Authors: Benjamin Hoover, Duen Horng Chau, Hendrik Strobelt, Parikshit Ram, Dmitry Krotov
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 4 (Empirical evaluation); Figure 3: DrDAM produces better approximations to the energies and gradients of MrDAM when the queries are closer to the stored patterns; Figure 4: A) Retrieval errors predictably follow the approximation quality of fig. 3. |
| Researcher Affiliation | Collaboration | Benjamin Hoover (IBM Research & Georgia Tech, benjamin.hoover@ibm.com); Duen Horng Chau (Georgia Tech, polo@gatech.edu); Hendrik Strobelt (IBM Research & MIT-IBM, hendrik.strobelt@ibm.com); Parikshit Ram (IBM Research, parikshit.ram@ibm.com); Dmitry Krotov (IBM Research, krotov@ibm.com) |
| Pseudocode | Yes | Algorithm 1: Procedures for DrDAM with random features. |
| Open Source Code | Yes | Experimental code with instructions to replicate the results in this paper is made available at this GitHub repository (https://github.com/bhoov/distributed_DAM), complete with instructions to set up the coding environment and run all experiments. |
| Open Datasets | Yes | Comparing energy descent dynamics between DrDAM and MrDAM on 3x64x64 images from Tiny Imagenet [11]; We stored K = 10 random images from CIFAR10 [43] into the memory matrix of MrDAM. |
| Dataset Splits | No | We generated 2K = 1000 unique, binary patterns (where each value is normalized to be in $\{0, 1\}^D$) and stored K = 500 of them into the memory matrix Ξ of MrDAM. ... The remaining patterns are treated as the random queries $x^b_{\text{far}}$ ... Finally, in addition to evaluating the energy at these random queries and at the stored patterns, we also want to evaluate the energy at queries $x^b_{\text{near}}$ that are near the stored patterns; thus, we take each stored pattern $\xi^\mu$ and perform bit-flips on 0.1D of its entries. (This describes data generation and query types, not a conventional train/validation/test split.) |
| Hardware Specification | Yes | All experiments are performed on a single L40s GPU equipped with 46GB VRAM. |
| Software Dependencies | No | Experiments were written and performed using the JAX [47] library for tensor manipulations. (JAX is mentioned, but no version number is given.) |
| Experiment Setup | Yes | In performing the qualitative reconstructions shown in fig. 1, we used a standard MrDAM energy (eq. (7)) configured with inverse temperature β = 60. We approximated this energy in a DrDAM using the trigonometric SinCos basis function shown in eq. (8) configured with feature dimension Y = 1.8e5. The four images shown were selected from the Tiny Imagenet [11] dataset, rasterized into a vector, and stored in the memory matrix of a MrDAM, resulting in a memory of shape (4, 12288). Energy descent for both MrDAM and DrDAM used standard gradient descent at a step size of 0.1 until the dynamics of all images converged (for fig. 1 after 300 steps, see energy traces). |
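
The experiment setup above can be illustrated with a toy sketch. It is not the authors' code (their JAX implementation is in the linked repository), and it rests on assumptions this page does not confirm: that the MrDAM energy of eq. (7) has the log-sum-exp form E(x) = -(1/β) log Σ_µ exp(-(β/2)‖x − ξ^µ‖²), and that the SinCos basis of eq. (8) is the usual random Fourier feature map for that Gaussian kernel. The names `mrdam_energy`, `mrdam_descent`, `sincos_features`, and `drdam_energy` are hypothetical, the three 8-bit patterns are made up, and only 500 random projections are used instead of Y = 1.8e5. Only β = 60, the step size 0.1, and the 300 steps come from the table.

```python
import math
import random

def sq_dist(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def mrdam_energy(x, memories, beta):
    # Assumed energy: -(1/beta) * log sum_mu exp(-beta/2 * ||x - xi_mu||^2),
    # computed with a max-shifted log-sum-exp for numerical stability.
    logits = [-0.5 * beta * sq_dist(x, m) for m in memories]
    top = max(logits)
    return -(top + math.log(sum(math.exp(l - top) for l in logits))) / beta

def mrdam_descent(x, memories, beta, step=0.1, n_steps=300):
    # Plain gradient descent; for the assumed energy the gradient is
    # x - sum_mu softmax_mu * xi_mu.
    for _ in range(n_steps):
        logits = [-0.5 * beta * sq_dist(x, m) for m in memories]
        top = max(logits)
        w = [math.exp(l - top) for l in logits]
        total = sum(w)
        pull = [sum(wk * m[d] for wk, m in zip(w, memories)) / total
                for d in range(len(x))]
        x = [xd - step * (xd - pd) for xd, pd in zip(x, pull)]
    return x

def sincos_features(x, omegas):
    # Random Fourier features: phi(x) . phi(y) approximates
    # exp(-beta/2 * ||x - y||^2) when each omega is drawn from N(0, beta * I).
    proj = [sum(wd * xd for wd, xd in zip(w, x)) for w in omegas]
    scale = 1.0 / math.sqrt(len(omegas))
    return [scale * math.cos(p) for p in proj] + [scale * math.sin(p) for p in proj]

def drdam_energy(x, memory_vector, omegas, beta, eps=1e-12):
    # The overlap estimates the kernel sum; clipping at eps (an assumption of
    # this sketch) guards the log against non-positive estimates.
    phi = sincos_features(x, omegas)
    overlap = sum(p * t for p, t in zip(phi, memory_vector))
    return -math.log(max(overlap, eps)) / beta

rng = random.Random(0)
beta, n_features = 60.0, 500

# Three made-up binary patterns standing in for the stored memories.
memories = [
    [1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 1.0],
    [1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0],
]

# A "near" query: the first stored pattern with one bit flipped.
query = list(memories[0])
query[1] = 1.0 - query[1]
retrieved = mrdam_descent(query, memories, beta)

# All stored patterns are summed into a single feature-space vector.
omegas = [[math.sqrt(beta) * rng.gauss(0.0, 1.0) for _ in range(8)]
          for _ in range(n_features)]
memory_vector = [sum(col) for col in zip(*(sincos_features(m, omegas) for m in memories))]

exact = mrdam_energy(memories[0], memories, beta)
approx = drdam_energy(memories[0], memory_vector, omegas, beta)
print("retrieved:", [round(v, 3) for v in retrieved])
print("energy at stored pattern (exact, approx):", exact, approx)
```

In this sketch the approximate energy depends on the stored patterns only through `memory_vector`, whose length is fixed by the number of random projections rather than by the number of memories. Evaluating `drdam_energy` at a point far from every stored pattern would be expected to give a poorer estimate, which is in line with the Figure 3 observation quoted in the table.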