A Wild Bootstrap for Degenerate Kernel Tests
Authors: Kacper P Chwialkowski, Dino Sejdinovic, Arthur Gretton
NeurIPS 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In experiments, the wild bootstrap gives strong performance on synthetic examples, on audio data, and in performance benchmarking for the Gibbs sampler. Our tests outperform both the naive approach which neglects the dependence structure within the samples, and the approach of [4], when testing across multiple lags. |
| Researcher Affiliation | Academia | Kacper Chwialkowski, Department of Computer Science, University College London, Gower Street, London WC1E 6BT (kacper.chwialkowski@gmail.com); Dino Sejdinovic, Gatsby Computational Neuroscience Unit, UCL, 17 Queen Square, London WC1N 3AR (dino.sejdinovic@gmail.com); Arthur Gretton, Gatsby Computational Neuroscience Unit, UCL, 17 Queen Square, London WC1N 3AR (arthur.gretton@gmail.com) |
| Pseudocode | No | The paper describes methods mathematically but does not include any pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code is available at https://github.com/kacperChwialkowski/wildBootstrap. |
| Open Datasets | No | The paper synthesises or defines the data generation processes (e.g., 'synthesise the sounds', 'Extinct Gaussian autoregressive process', 'process sampled according to the dynamics proposed by [4]') rather than using pre-existing public datasets with explicit access information. |
| Dataset Splits | No | The paper discusses sample sizes for experiments (e.g., 'sample size=500', 'sample sizes are (nx, ny) = {(300, 200), (600, 400), (900, 600)}', 'n is the sample size') but does not specify explicit train/validation/test dataset splits as typically found in machine learning model development. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU/CPU models, memory, or cloud instance types used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies or library version numbers required for replication. |
| Experiment Setup | Yes | MCMC: sample size = 500; a Gaussian kernel with bandwidth σ = 1.7 is used; every second Gibbs sample is kept (i.e., after a pass through both dimensions). Audio: sample sizes are (nx, ny) ∈ {(300, 200), (600, 400), (900, 600)}; a Gaussian kernel with bandwidth σ = 14 is used. Both: the wild bootstrap uses a block size of ln = 20; results are averaged over at least 200 trials. In lag-HSIC, the number of lags under examination was max{10, log n}, where n is the sample size. We used Gaussian kernels with widths estimated by the median heuristic. The cumulative distribution of the V-statistics was approximated by samples of the bootstrapped statistic nV_b2; to model the tail of this distribution, we fitted the generalized Pareto distribution to the bootstrapped samples. |
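
To make the setup in the last row concrete, below is a minimal Python/NumPy/SciPy sketch of the wild bootstrap null for a degenerate V-statistic, using the same ingredients the paper reports: an autoregressive wild bootstrap process with block parameter ln = 20, Gaussian kernels with median-heuristic bandwidths, and a generalized Pareto fit to the tail of the bootstrap distribution. This is an illustration based on the paper's description, not the authors' released code; the function names (`wild_bootstrap_multipliers`, `wild_bootstrap_null`, `p_value_gpd_tail`) and defaults are ours.

```python
import numpy as np
from scipy.spatial.distance import cdist
from scipy.stats import genpareto

def median_heuristic(X):
    """Median-heuristic bandwidth: median of nonzero pairwise distances."""
    d = cdist(X, X, "euclidean")
    return np.median(d[d > 0])

def gaussian_kernel(X, sigma):
    """Gaussian (RBF) Gram matrix with bandwidth sigma."""
    return np.exp(-cdist(X, X, "sqeuclidean") / (2.0 * sigma**2))

def wild_bootstrap_multipliers(n, ln=20.0, rng=None):
    """AR(1) wild bootstrap process with block parameter ln (our reading of
    the paper): W_t = exp(-1/ln) * W_{t-1} + sqrt(1 - exp(-2/ln)) * eps_t,
    eps_t ~ N(0, 1). ln sets how long the multipliers stay correlated,
    mimicking the dependence structure of the observed time series."""
    rng = np.random.default_rng(rng)
    a = np.exp(-1.0 / ln)
    eps = rng.standard_normal(n)
    w = np.empty(n)
    w[0] = eps[0]  # stationary N(0, 1) start
    for t in range(1, n):
        w[t] = a * w[t - 1] + np.sqrt(1.0 - a**2) * eps[t]
    return w

def wild_bootstrap_null(H, ln=20.0, n_boot=2000, rng=None):
    """Bootstrap samples of a degenerate V-statistic nV = (1/n) sum_ij H_ij,
    replacing each term by W_i * W_j * H_ij (the nV_b2 statistic above)."""
    rng = np.random.default_rng(rng)
    n = H.shape[0]
    out = np.empty(n_boot)
    for b in range(n_boot):
        w = wild_bootstrap_multipliers(n, ln, rng)
        out[b] = w @ H @ w / n
    return out

def p_value_gpd_tail(stat, boot, tail_frac=0.1):
    """Empirical p-value, with a generalized Pareto distribution fitted to
    exceedances over a high threshold to model the tail more accurately."""
    thresh = np.quantile(boot, 1.0 - tail_frac)
    if stat <= thresh:
        return float(np.mean(boot >= stat))
    c, loc, scale = genpareto.fit(boot[boot > thresh] - thresh, floc=0.0)
    return float(tail_frac * genpareto.sf(stat - thresh, c, loc=loc, scale=scale))
```

As a usage sketch (again our illustration, not a line-by-line transcription of the released code): for an HSIC-type independence test on paired series, take `H` as the elementwise product of the doubly centred Gram matrices of the two series, compute `stat = H.sum() / len(H)`, draw `boot = wild_bootstrap_null(H)`, and report `p_value_gpd_tail(stat, boot)`. The block parameter `ln` plays the role described above: it controls how slowly the multipliers decorrelate, so the bootstrap respects the temporal dependence that a naive permutation test would destroy.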