Nyström Kernel Mean Embeddings

Authors: Antoine Chatalic, Nicolas Schreuder, Lorenzo Rosasco, Alessandro Rudi

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We provide a proof of work in a simple experimental setting, but extending these results to broader families of datasets and kernel types would be interesting in the future. We first generate data according to a Gaussian mixture... We then perform experiments with data from the Fasttext (Bojanowski et al. 2016) (English features), FMA (Defferrard et al. 2016) (MFCC features), Intel Lab and Gowalla (Cho et al. 2011) datasets...
Researcher Affiliation | Academia | 1 MaLGA & DIBRIS, Università di Genova; 2 Inria, École normale supérieure, PSL Research University; 3 CBMM, MIT, IIT.
Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any information about open-source code for the described methodology.
Open Datasets | Yes | We then perform experiments with data from the Fasttext (Bojanowski et al. 2016) (English features), FMA (Defferrard et al. 2016) (MFCC features), Intel Lab and Gowalla (Cho et al. 2011) datasets... https://fasttext.cc/docs/en/english-vectors.html https://github.com/mdeff/fma http://db.csail.mit.edu/labdata/labdata.html https://snap.stanford.edu/data/loc-gowalla.html
Dataset Splits | No | For each dataset, we consider ρ to be the uniform distribution over these points, and we build the empirical estimator using a random sample of size n = 10^4. The paper specifies the sample size but does not provide details on specific training, validation, or test splits, percentages, or absolute counts for these subsets.
Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments.
Software Dependencies | No | The paper does not list specific software dependencies with version numbers.
Experiment Setup | Yes | The standard deviation σ_k of the kernel is chosen to be the median of the inter-point distances, estimated for efficiency on a random subset of 1000 points.
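
The bandwidth rule quoted under Experiment Setup (median of the inter-point distances, estimated on a random subset of 1000 points) can be illustrated with a short sketch. The paper provides no code, so everything below is an assumption made for illustration: the function names `median_heuristic_bandwidth` and `gaussian_kernel`, the use of NumPy, the fixed seed, and the Gaussian kernel form k(x, y) = exp(-||x - y||^2 / (2 σ_k^2)) implied by the phrase "standard deviation of the kernel".

```python
import numpy as np


def median_heuristic_bandwidth(X, subset_size=1000, seed=0):
    """Median of pairwise Euclidean distances on a random subset of the rows of X.

    Hypothetical helper: name, signature, and seed handling are illustrative
    assumptions, not taken from the paper.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    m = min(subset_size, n)
    if m < 2:
        raise ValueError("need at least two points to compute a distance")

    # Draw the subset uniformly at random, without replacement.
    rng = np.random.default_rng(seed)
    S = X[rng.choice(n, size=m, replace=False)]

    # Squared distances via ||a||^2 + ||b||^2 - 2 <a, b>, clipped at zero
    # to guard against small negative values caused by round-off.
    sq_norms = np.sum(S ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (S @ S.T)
    np.maximum(sq_dists, 0.0, out=sq_dists)

    # Keep each unordered pair once (strict upper triangle), then take the median.
    upper = np.triu_indices(m, k=1)
    return float(np.median(np.sqrt(sq_dists[upper])))


def gaussian_kernel(A, B, sigma):
    """Gaussian kernel matrix between rows of A and rows of B (assumed form)."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    sq = (
        np.sum(A ** 2, axis=1)[:, None]
        + np.sum(B ** 2, axis=1)[None, :]
        - 2.0 * (A @ B.T)
    )
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma ** 2))
```

Restricting the estimate to a subset keeps the cost at roughly subset_size^2 distance evaluations regardless of the dataset size, which is presumably the efficiency motivation mentioned in the quoted setup.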