Kernelized Synaptic Weight Matrices

Authors: Lorenz Muller, Julien Martel, Giacomo Indiveri

ICML 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In the following experiments we demonstrate on some examples how kernel Nets can be used in practice. We investigate the effect of using different kernel functions (Section 3.1), show how to incorporate prior knowledge into the network parameters (Section 3.2), create extensible data visualizations (Sections 3.3, 3.4), and achieve state-of-the-art performance on the MovieLens dataset (Section 3.5), while reducing the computational complexity of the model." (The kernelized-weight idea behind "kernel Nets" is sketched after the table.)
Researcher Affiliation | Academia | "Institute of Neuroinformatics, University of Zurich and ETH Zurich, Switzerland. Correspondence to: Lorenz K. Muller <lorenz@ini.ethz.ch>."
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks; it describes mathematical equations and general procedures, but not in algorithm form.
Open Source Code | Yes | "Code available in the supplement." (stated in three separate footnotes)
Open Datasets | Yes | "We train our models to predict movie ratings of the MovieLens-10M (ML-10M), MovieLens-1M (ML-1M) and MovieLens-100K (ML-100K) datasets (Harper & Konstan, 2016)."
Dataset Splits | Yes | "We randomly designate 10% or 20% respectively of the given ratings as validation data (so chosen to match the models we compare to). The validation data is not used in training and is used alone in the reported error computation. Reported performances average over five such random splits." (A sketch of this splitting protocol follows the table.)
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU or GPU models, or memory) for its experiments.
Software Dependencies | No | The paper mentions the ADAM learning rule and the L-BFGS-B and RPROP optimizers, but does not give software names with version numbers (e.g., Python, PyTorch, or TensorFlow versions, or specific library versions) needed for reproducibility.
Experiment Setup | Yes | "All networks were trained using the ADAM learning rule and a range of hyperparameters (learning rate and l2 regularization); for each network the best mean performance over 5 repetitions is shown. ... For the kernel Net we use d = 50 and for the sparse fully-connected case d = 5. All hidden layers have size 500." (A hedged sketch of this training setup follows the table.)
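For context on what the quoted overview calls "kernel Nets": per the paper's title, the model parameterizes a layer's weight matrix through a kernel function evaluated on learned low-dimensional neuron embeddings (the d in the Experiment Setup row), so the layer stores (in + out) × d parameters instead of in × out. The sketch below is a minimal, hypothetical PyTorch rendering of that idea; the class name KernelizedLinear, the Gaussian kernel choice, the lengthscale parameter, and the initialization are assumptions, not the authors' supplement code.

```python
import torch
import torch.nn as nn

class KernelizedLinear(nn.Module):
    """Dense layer whose weight matrix is generated by a kernel
    k(u_i, v_j) over learned d-dimensional neuron embeddings.
    Hypothetical sketch; the Gaussian kernel is just one choice among
    the kernel functions the quoted Section 3.1 reportedly compares."""

    def __init__(self, in_features, out_features, d=50, lengthscale=1.0):
        super().__init__()
        # One learned d-dimensional embedding per input / output neuron.
        self.u = nn.Parameter(0.1 * torch.randn(in_features, d))
        self.v = nn.Parameter(0.1 * torch.randn(out_features, d))
        self.bias = nn.Parameter(torch.zeros(out_features))
        self.lengthscale = lengthscale

    def weight(self):
        # Pairwise squared distances between embeddings, mapped through
        # a Gaussian kernel to produce the (in x out) weight matrix.
        dist2 = torch.cdist(self.u, self.v).pow(2)
        return torch.exp(-dist2 / (2.0 * self.lengthscale ** 2))

    def forward(self, x):
        return x @ self.weight() + self.bias
```

One plausible reading of "incorporate prior knowledge into the network parameters" (Section 3.2 in the quote) is that such knowledge enters as an initialization of the embeddings u and v, though the quote does not confirm this.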
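The Dataset Splits row pins down the evaluation protocol precisely enough to sketch: a per-rating random validation split, with results averaged over five such splits. Below is a minimal Python sketch of that procedure, assuming ratings are stored as rows of a NumPy array; the function name split_ratings and the seeds are hypothetical.

```python
import numpy as np

def split_ratings(ratings, val_fraction=0.1, seed=0):
    """Randomly designate `val_fraction` of the given ratings as
    validation data; the remainder is used for training.
    `ratings` is assumed to be an array of (user, item, rating) rows."""
    rng = np.random.default_rng(seed)
    n = len(ratings)
    val_idx = rng.choice(n, size=int(round(val_fraction * n)), replace=False)
    val_mask = np.zeros(n, dtype=bool)
    val_mask[val_idx] = True
    return ratings[~val_mask], ratings[val_mask]

# Five random splits, averaged in the reported numbers (seeds illustrative):
# splits = [split_ratings(ratings, val_fraction=0.1, seed=s) for s in range(5)]
```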
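Similarly, the Experiment Setup row fixes the optimizer (ADAM), a search over learning rate and l2 regularization, hidden layers of size 500, and the best-of-5-repetitions reporting. Here is a hedged sketch of that loop, reusing the hypothetical KernelizedLinear layer above; the grid values and epoch count are placeholders, since the quote does not give them.

```python
import itertools
import torch

def train_one(model, train_loader, lr, weight_decay, epochs=30):
    """ADAM training; weight_decay supplies the l2 regularization.
    Placeholder loop for the quoted setup, not the authors' code."""
    opt = torch.optim.Adam(model.parameters(), lr=lr, weight_decay=weight_decay)
    loss_fn = torch.nn.MSELoss()
    for _ in range(epochs):
        for x, y in train_loader:
            opt.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            opt.step()
    return model

# Illustrative grid; the paper only states that a range of learning
# rates and l2 strengths was searched, with best mean of 5 runs reported.
grid = itertools.product([1e-3, 1e-4], [0.0, 1e-4, 1e-3])

# Hidden size 500 and d = 50 for the kernel Net, per the quote
# (n_inputs is hypothetical):
# model = torch.nn.Sequential(
#     KernelizedLinear(n_inputs, 500, d=50), torch.nn.Sigmoid(),
#     KernelizedLinear(500, n_inputs, d=50))
```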