Kernel similarity matching with Hebbian networks

Authors: Kyle Luther, Sebastian Seung

NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We compare the approximation quality of our method to the neural random Fourier method and variants of the popular but non-biological Nyström method for approximating the kernel matrix. Our method appears to be comparable to or better than randomly sampled Nyström methods when the outputs are relatively low dimensional (although still potentially higher dimensional than the inputs), but less faithful when the outputs are very high dimensional. We train networks using a Gaussian kernel for the half moons dataset and a power-cosine kernel (defined in Section 4.2) for the MNIST dataset.
Researcher Affiliation | Academia | Kyle Luther, Princeton Neuroscience Institute, kluther@princeton.edu; H. Sebastian Seung, Princeton Neuroscience Institute and Princeton Computer Science Department, sseung@princeton.edu
Pseudocode | No | Applying a stochastic gradient descent-ascent algorithm to Eq. (9) yields a neural network (Fig. 1) in which y_i^t is the response of neuron i to input pattern t, w_i is the vector of incoming connections to neuron i from the input layer, q_i is a term which modulates the strength of these incoming connections, and L_ij is a matrix of lateral recurrent connections between outputs. A hedged sketch of such a descent-ascent step is given below the table.
Open Source Code | Yes | We have attached the main portion of the training code, written using PyTorch, in the appendix.
Open Datasets | Yes | We train our algorithm on a simple half moons dataset (which can be generated with Pedregosa et al. [2011]) and on the MNIST handwritten digits dataset of LeCun [1998]. A loading sketch for both datasets is given below the table.
Dataset Splits | No | The dataset consists of 70,000 images of 28x28 handwritten digits, which we cropped by 4 pixels on each side to yield 20x20 images (which become 400-dimensional inputs). We train networks with α ∈ {1, 2, 3, 4} and n = 800 neurons (so the output dimensionality is exactly 2x the input dimensionality).
Hardware Specification | No | The paper does not provide specific details on the hardware used for experimentation.
Software Dependencies | No | We have attached the main portion of the training code, written using PyTorch, in the appendix. We run KMeans on x and on y (we use the implementation of scikit-learn, and take the lowest energy solution using 100 inits). A KMeans sketch is given below the table.
Experiment Setup | Yes | In our experiments λ = 0.001. We use a Gaussian kernel with σ = 0.3 to measure input similarities: f(u, v) = exp(-‖u - v‖² / (2σ²)). We vary the number of neurons n ∈ {2, 4, 8, 16, 32, 64}. We train every configuration using k ∈ {1, 3, 10, 30, 100, 300, 1000, 3000} labels per class. We train all configurations with each weight decay parameter λ ∈ {1e-5, 1e-4, 1e-3, 1e-2, 1e-1, 1} and report the one which yields the highest test accuracy. The kernel and these hyperparameter grids are sketched in code below.
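
The Pseudocode row refers to a stochastic gradient descent-ascent algorithm applied to the paper's Eq. (9), which is not reproduced in this report. The sketch below is a minimal, hypothetical illustration of that general structure only: per-pattern responses y are updated by gradient ascent while the feedforward weights w, gains q, and lateral connections L are updated by gradient descent. The function toy_objective is a placeholder, not the paper's Eq. (9), and the sizes are arbitrary.

    import torch

    # Hypothetical sizes: d input dimensions, n output neurons (not the paper's settings).
    d, n = 20, 8
    w = torch.randn(n, d, requires_grad=True)    # incoming weights w_i from the input layer
    q = torch.ones(n, requires_grad=True)        # gains q_i modulating the incoming weights
    L = torch.eye(n, requires_grad=True)         # lateral recurrent connections L_ij

    def toy_objective(y, x):
        # Placeholder saddle objective, NOT the paper's Eq. (9): a linear feedforward drive
        # minus a quadratic lateral term, so it is concave in y and can be ascended.
        drive = (y * (q * (x @ w.t()))).sum()
        lateral = 0.5 * ((y @ L) * y).sum()
        return drive - lateral

    def descent_ascent_step(x, lr=1e-2, ascent_steps=20):
        # Ascend in the per-pattern responses y ...
        y = torch.zeros(x.shape[0], n, requires_grad=True)
        for _ in range(ascent_steps):
            (g,) = torch.autograd.grad(toy_objective(y, x), y)
            with torch.no_grad():
                y += lr * g
        # ... then descend in the connection parameters w, q, L.
        grads = torch.autograd.grad(toy_objective(y.detach(), x), (w, q, L))
        with torch.no_grad():
            for p, g in zip((w, q, L), grads):
                p -= lr * g
        return y.detach()

    # Usage: one step on a random minibatch of 16 patterns.
    responses = descent_ascent_step(torch.randn(16, d))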
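
For the Open Datasets and Dataset Splits rows, a minimal loading sketch is given here. It assumes scikit-learn's make_moons for the half moons data and torchvision for MNIST; the center crop from 28x28 to 20x20 (4 pixels on each side, giving 400-dimensional inputs) follows the quoted description, while everything else (sample counts, noise level, scaling) is an assumption.

    import numpy as np
    from sklearn.datasets import make_moons
    from torchvision.datasets import MNIST

    # Half moons (Pedregosa et al. [2011] = scikit-learn); sample count and noise are assumptions.
    x_moons, _ = make_moons(n_samples=2000, noise=0.1, random_state=0)

    # MNIST (LeCun [1998]) via torchvision; combine train and test to get all 70,000 images.
    train = MNIST("data", train=True, download=True)
    test = MNIST("data", train=False, download=True)
    images = np.concatenate([train.data.numpy(), test.data.numpy()])   # (70000, 28, 28)

    # Crop 4 pixels on each side: 28x28 -> 20x20 -> 400-dimensional inputs.
    cropped = images[:, 4:-4, 4:-4]
    x_mnist = cropped.reshape(len(cropped), -1).astype(np.float32) / 255.0  # [0, 1] scaling is an assumption
    assert x_mnist.shape[1] == 400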
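
The Software Dependencies row mentions running KMeans on the inputs x and on the outputs y with scikit-learn, keeping the lowest-energy solution over 100 initializations. A minimal sketch, assuming the standard k-means objective and a hypothetical cluster count:

    from sklearn.cluster import KMeans

    def best_kmeans(data, n_clusters=10, n_init=100, seed=0):
        # scikit-learn already keeps the lowest-inertia (lowest-energy) run out of n_init restarts.
        km = KMeans(n_clusters=n_clusters, n_init=n_init, random_state=seed)
        km.fit(data)
        return km.labels_, km.inertia_

    # Usage (x and y as produced above or by the trained network):
    # labels_x, energy_x = best_kmeans(x_mnist)
    # labels_y, energy_y = best_kmeans(y)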
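
For the Experiment Setup row, the quoted Gaussian kernel and the reported hyperparameter grids can be written out directly. The kernel width σ = 0.3, λ = 0.001, and the three grids are taken from the text; the training loop itself is not shown in this report, so train_and_evaluate is a hypothetical stand-in.

    import numpy as np

    SIGMA = 0.3      # Gaussian kernel width (from the paper)
    LAMBDA = 0.001   # lambda used in the main experiments (from the paper)

    def gaussian_kernel(u, v, sigma=SIGMA):
        # f(u, v) = exp(-||u - v||^2 / (2 sigma^2))
        return np.exp(-np.sum((u - v) ** 2) / (2 * sigma ** 2))

    # Grids reported in the paper.
    n_neurons      = [2, 4, 8, 16, 32, 64]
    labels_per_cls = [1, 3, 10, 30, 100, 300, 1000, 3000]
    weight_decays  = [1e-5, 1e-4, 1e-3, 1e-2, 1e-1, 1.0]

    # Hypothetical sweep: for each (n, k), keep the weight decay giving the best test accuracy.
    # def train_and_evaluate(n, k, wd): ...
    # best = {
    #     (n, k): max(weight_decays, key=lambda wd: train_and_evaluate(n, k, wd))
    #     for n in n_neurons for k in labels_per_cls
    # }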