Kernel similarity matching with Hebbian networks
Authors: Kyle Luther, Sebastian Seung
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We compare the approximation quality of our method to the neural random Fourier method and variants of the popular but non-biological Nyström method for approximating the kernel matrix. Our method appears to be comparable to or better than randomly sampled Nyström methods when the outputs are relatively low dimensional (although still potentially higher dimensional than the inputs) but less faithful when the outputs are very high dimensional. We train networks using a Gaussian kernel for the half moons dataset and a power-cosine kernel (defined in section 4.2) for the MNIST dataset. |
| Researcher Affiliation | Academia | Kyle Luther Princeton Neuroscience Institute kluther@princeton.edu H. Sebastian Seung Princeton Neuroscience Institute Princeton Computer Science Department sseung@princeton.edu |
| Pseudocode | No | Applying a stochastic gradient descent-ascent algorithm to Eq. (9) yields a neural network (Fig. 1) in which y_i^t is the response of neuron i to input pattern t, w_i is the vector of incoming connections to neuron i from the input layer, q_i is a term which modulates the strength of these incoming connections, and L_ij is a matrix of lateral recurrent connections between outputs. (A schematic descent-ascent sketch appears after this table.) |
| Open Source Code | Yes | We have attached the main portion of the training code, written using PyTorch, in the appendix. |
| Open Datasets | Yes | We train our algorithm on a simple half moons dataset (which can be generated with Pedregosa et al. [2011]) and on the MNIST handwritten digits dataset LeCun [1998]. |
| Dataset Splits | No | The dataset consists of 70,000 images of 28x28 handwritten digits, which we cropped by 4 pixels on each side to yield 20x20 images (which become 400-dimensional inputs). We train networks with α ∈ {1, 2, 3, 4} and n = 800 neurons (so the output dimensionality is exactly 2x the input dimensionality). (A cropping sketch appears after this table.) |
| Hardware Specification | No | The paper does not provide specific details on the hardware used for experimentation. |
| Software Dependencies | No | We have attached the main portion of the training code, written using PyTorch, in the appendix. We run KMeans on x and on y (we use the implementation of scikit-learn, and take the lowest energy solution using 100 inits). (A KMeans sketch appears after this table.) |
| Experiment Setup | Yes | In our experiments λ = 0.001. We use a Gaussian kernel with σ = 0.3 to measure input similarities: f(u, v) = exp(−‖u − v‖² / (2σ²)). We vary the number of neurons n ∈ {2, 4, 8, 16, 32, 64}. We train every configuration using k ∈ {1, 3, 10, 30, 100, 300, 1000, 3000} labels per class. We train all configurations with the weight decay parameter λ ∈ {1e-5, 1e-4, 1e-3, 1e-2, 1e-1, 1} which yields the highest test accuracy. (A kernel-evaluation sketch appears after this table.) |
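
The pseudocode excerpt describes the network only in prose. For orientation, the sketch below implements the classical stochastic descent-ascent loop for linear similarity matching (in the Pehlevan-Chklovskii style): the output y relaxes under recurrent dynamics, the feedforward weights (the w_i of the excerpt) receive a Hebbian update, and the lateral weights (the L_ij) receive an anti-Hebbian update. It is a generic illustration of that structure, not the paper's Eq. (9); the kernel-specific terms and the q_i modulation variables are omitted, and every hyperparameter below is arbitrary.

```python
import numpy as np

def similarity_matching_step(x, W, M, lr=1e-2, n_steps=50, dt=0.1):
    """One stochastic descent-ascent step of classical (linear) similarity
    matching. W: feedforward weights (rows ~ w_i), M: lateral weights."""
    # Neural dynamics: relax y toward the fixed point of dy/dt = W x - M y.
    y = np.zeros(W.shape[0])
    for _ in range(n_steps):
        y += dt * (W @ x - M @ y)
    # Hebbian update of the feedforward weights (descent side).
    W += lr * (np.outer(y, x) - W)
    # Anti-Hebbian update of the lateral weights (ascent side).
    M += lr * (np.outer(y, y) - M)
    return y, W, M

# Toy usage: 2-dimensional inputs mapped to 4 output neurons.
rng = np.random.default_rng(0)
W = 0.1 * rng.standard_normal((4, 2))
M = np.eye(4)
for x in rng.standard_normal((1000, 2)):
    y, W, M = similarity_matching_step(x, W, M)
```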
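The experiment-setup excerpt fixes the target similarities but not how they are computed. A minimal sketch, assuming half moons inputs generated with scikit-learn and the Gaussian kernel f(u, v) = exp(−‖u − v‖² / (2σ²)) with σ = 0.3 quoted above; it builds the target similarity matrix only and does not train the network (the sample count and noise level are assumptions).

```python
import numpy as np
from sklearn.datasets import make_moons

sigma = 0.3  # kernel width quoted in the excerpt
X, _ = make_moons(n_samples=500, noise=0.05, random_state=0)  # half moons inputs

# Pairwise squared distances via ||u||^2 + ||v||^2 - 2 u.v, then the
# Gaussian kernel matrix K[t, s] = f(x^t, x^s).
sq_norms = (X ** 2).sum(axis=1)
sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
K = np.exp(-sq_dists / (2.0 * sigma ** 2))
```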
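The 28x28 to 20x20 crop mentioned in the dataset-splits excerpt is a single slice; a sketch assuming the digits arrive as an (N, 28, 28) array (the placeholder batch here is purely illustrative).

```python
import numpy as np

imgs = np.zeros((10, 28, 28))            # placeholder MNIST batch for illustration
cropped = imgs[:, 4:-4, 4:-4]            # drop 4 pixels per side -> (N, 20, 20)
inputs = cropped.reshape(len(imgs), -1)  # flatten to 400-dimensional inputs
```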
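The clustering step quoted under Software Dependencies ("take the lowest energy solution using 100 inits") maps directly onto scikit-learn's n_init argument, which keeps the run with the lowest inertia. A minimal sketch; the cluster count and input data are placeholders.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_moons

X, _ = make_moons(n_samples=500, noise=0.05, random_state=0)

# n_init=100 restarts k-means from 100 random initializations and keeps the
# solution with the lowest within-cluster sum of squares (inertia), i.e. the
# "lowest energy" run mentioned in the excerpt.
km = KMeans(n_clusters=10, n_init=100, random_state=0).fit(X)
labels, energy = km.labels_, km.inertia_
```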