Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. [K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026].

Self-Attention through Kernel-Eigen Pair Sparse Variational Gaussian Processes

Authors: Yingyi Chen, Qinghua Tao, Francesco Tonin, Johan Suykens

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments verify our excellent performances and efficiency on in-distribution, distribution-shift and out-of-distribution benchmarks.
Researcher Affiliation | Academia | ¹ESAT-STADIUS, KU Leuven, Belgium; ²LIONS, EPFL, Switzerland (most of the work was done at ESAT-STADIUS, KU Leuven).
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Code is at https://github.com/yingyichen-cyy/KEP-SVGP.
Open Datasets | Yes | We conduct empirical evaluations on benchmarks including i) computer vision: CIFAR-10, CIFAR-100 (Krizhevsky et al., 2009); ii) language modelling: IMDB sentiment analysis (Maas et al., 2011), CoLA linguistic acceptability prediction (Warstadt et al., 2019).
Dataset Splits | Yes | For both CIFAR-10, CIFAR-100, we randomly split the original training set into 90% training and 10% validation set, leading to a training set of 45K samples and a validation set of 5K. The test set is of 10K samples.
Hardware Specification | Yes | Comparisons of performance and efficiency on a single NVIDIA Tesla V100 SXM2 32 GB.
Software Dependencies | No | All experiments presented in this work are implemented with PyTorch.
Experiment Setup | Yes | For both CIFAR-10, CIFAR-100, we train 7-layer Vision Transformer (ViT) (Dosovitskiy et al., 2021), optimized by Adam with batch size 128 and a cosine learning rate initialized with 10^-3 for 300 epochs.
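
The split and schedule quoted in the Dataset Splits and Experiment Setup rows can be illustrated with a minimal, dependency-free sketch. It is not the authors' code. It assumes a 50K-sample CIFAR training set (45K + 5K), reads the initial learning rate as 10^-3, and uses the common closed-form cosine annealing rule. The seed, the minimum learning rate of 0, the absence of warmup, and the helper names `split_indices` and `cosine_lr` are all hypothetical choices made for illustration.

```python
import math
import random


def split_indices(n_total=50_000, val_fraction=0.10, seed=0):
    """Randomly partition range(n_total) into (train, validation) index lists."""
    indices = list(range(n_total))
    # The seed is an assumption; the quoted text does not specify one.
    random.Random(seed).shuffle(indices)
    n_val = round(n_total * val_fraction)
    return indices[n_val:], indices[:n_val]


def cosine_lr(epoch, total_epochs=300, base_lr=1e-3, min_lr=0.0):
    """Cosine-annealed learning rate at the start of `epoch`.

    min_lr=0.0 and the lack of warmup are assumptions; the quoted text
    only says "a cosine learning rate".
    """
    progress = epoch / total_epochs
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


train_idx, val_idx = split_indices()
print(len(train_idx), len(val_idx))

# Learning rate at the start, midpoint, and end of training.
for epoch in (0, 150, 300):
    print(epoch, cosine_lr(epoch))
```

In a PyTorch setup the same roles would typically be played by `torch.utils.data.random_split` and `torch.optim.lr_scheduler.CosineAnnealingLR`; whether the released repository uses those exact utilities is not stated in the summary above.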