Kernelized Cumulants: Beyond Kernel Mean Embeddings

Authors: Patric Bonnier, Harald Oberhauser, Zoltan Szabo

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We argue both theoretically and empirically (on synthetic, environmental, and traffic data analysis) that going beyond degree one has several advantages and can be achieved with the same computational complexity and minimal overhead in our experiments. In this section, we demonstrate the efficiency of the proposed kernel cumulants in two-sample and independence testing.
Researcher Affiliation | Academia | Patric Bonnier (1), Harald Oberhauser (1), Zoltán Szabó (2). (1) Mathematical Institute, University of Oxford; (2) Department of Statistics, London School of Economics.
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks (clearly labeled algorithm sections or code-like formatted procedures).
Open Source Code | Yes | All the code replicating our experiments is available at https://github.com/PatricBonnier/Kernelized-Cumulants.
Open Datasets | Yes | The Seoul bicycle data set (E et al., 2020) consists of environmental data along with the number of bicycle rentals. We used the Sao Paulo traffic benchmark (Ferreira, 2016) to perform independence testing.
Dataset Splits | No | Permutation test was applied to approximate the null distribution and its 0.95-quantile (which corresponds to the level choice α = 0.05): We first computed our test statistic S using the given samples ($S_0 = S$), and then permuted the samples 100 times. The paper describes a permutation testing procedure but does not provide specific training, validation, or test dataset splits in terms of percentages or sample counts.
Hardware Specification | Yes | The experiments were carried out on a laptop with an i7 CPU and 16GBs of RAM.
Software Dependencies | No | All experiments were performed using the rbf-kernel $\mathrm{rbf}_\sigma(x, y) = e^{-\|x - y\|_2^2 / (2\sigma^2)}$. The paper mentions using an RBF kernel and implies Python for the code, but it does not provide specific version numbers for any key software components or libraries.
Experiment Setup | Yes | All experiments were performed using the rbf-kernel $\mathrm{rbf}_\sigma(x, y) = e^{-\|x - y\|_2^2 / (2\sigma^2)}$, where the parameter σ is called the bandwidth. We performed all experiments for every bandwidth of the form $\sigma = a \cdot 10^{b}$, where a = 1, 2.5, 5, 7.5 and b = −5, −4, −3, −2, −1, 0, and the optimal value across the bandwidths was chosen for each method and sample size.
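
The setup quoted above (RBF kernel, bandwidth grid, permutation test with 100 permutations at level α = 0.05) can be illustrated with a minimal sketch. This is not the authors' code: the function names are hypothetical, the grid assumes the exponents b run over −5, …, 0, and the test statistic is a plain biased squared MMD used as a stand-in for the paper's kernelized cumulant statistics. Only the Python standard library is used.

```python
import math
import random


def rbf(x, y, sigma):
    """RBF kernel exp(-||x - y||_2^2 / (2 sigma^2)) for equal-length sequences."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def bandwidth_grid():
    """All sigma = a * 10**b; the negative exponents are an assumption."""
    return [a * 10.0 ** b
            for b in (-5, -4, -3, -2, -1, 0)
            for a in (1, 2.5, 5, 7.5)]


def gram(points, sigma):
    """Full Gram matrix of the pooled sample, computed once per bandwidth."""
    return [[rbf(p, q, sigma) for q in points] for p in points]


def mmd2(K, idx_x, idx_y):
    """Biased squared MMD read off a pooled Gram matrix (stand-in statistic)."""
    def block_mean(rows, cols):
        total = sum(K[i][j] for i in rows for j in cols)
        return total / (len(rows) * len(cols))
    return (block_mean(idx_x, idx_x) + block_mean(idx_y, idx_y)
            - 2.0 * block_mean(idx_x, idx_y))


def permutation_test(xs, ys, sigma, n_perm=100, alpha=0.05, seed=0):
    """Two-sample permutation test; returns (statistic, threshold, reject)."""
    rng = random.Random(seed)
    pooled = list(xs) + list(ys)
    K = gram(pooled, sigma)
    n = len(xs)
    idx = list(range(len(pooled)))
    s0 = mmd2(K, idx[:n], idx[n:])          # statistic on the given samples
    null = []
    for _ in range(n_perm):
        rng.shuffle(idx)                    # reassign pooled points to groups
        null.append(mmd2(K, idx[:n], idx[n:]))
    null.sort()
    # Empirical (1 - alpha)-quantile of the permuted statistics.
    k = min(n_perm - 1, math.ceil((1 - alpha) * n_perm) - 1)
    threshold = null[k]
    return s0, threshold, s0 > threshold
```

To mirror the reported protocol, one would call `permutation_test` for every value in `bandwidth_grid()` and keep the best-performing bandwidth per method and sample size; how "optimal" is measured is not specified in the excerpt, so that selection step is left out of the sketch.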