Learning deep kernels for exponential family densities
Authors: Li Wenliang, Danica J. Sutherland, Heiko Strathmann, Arthur Gretton
ICML 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In empirical studies, deep maximum-likelihood models can yield higher likelihoods, while our approach gives better estimates of the gradient of the log density, the score, which describes the distribution's shape. |
| Researcher Affiliation | Academia | Gatsby Computational Neuroscience Unit, University College London, London, U.K. |
| Pseudocode | Yes | Algorithm 1: Full training procedure. Input: dataset D; initial inducing points z, kernel parameters w, regularization λ = (λ_α, λ_C). Split D into D1 and D2. Optimize w, λ, z, and maybe q0 params: while Ĵ(p̂^{k_w}_{α(λ, k_w, z, D1)}, z, D2) is still improving, do: sample disjoint data subsets D_t, D_v ⊂ D1; set f(·) = Σ_{m=1}^{M} α_m(λ, k_w, z, D_t) k_w(z_m, ·); compute Ĵ = (1/|D_v|) Σ_{n=1}^{|D_v|} Σ_{d=1}^{D} [∂_d² f(x_n) + ½ (∂_d f(x_n))²]; take an SGD step in Ĵ for w, λ, z, and maybe q0 params. Optimize λ for fitting on larger batches: while Ĵ(p̂^{k_w}_{α(λ, k_w, z, D1)}, z, D2) is still improving, do: set f(·) = Σ_{m=1}^{M} α_m(λ, k_w, z, D1) k_w(·, z_m); sample a subset D_v ⊂ D2; compute Ĵ = (1/|D_v|) Σ_{n=1}^{|D_v|} Σ_{d=1}^{D} [∂_d² f(x_n) + ½ (∂_d f(x_n))²]; take SGD steps in Ĵ for λ only. Finalize α on D1: find α = α(λ, k_w, z, D1). Return: log p̂(·) = Σ_{m=1}^{M} α_m k_w(·, z_m) + log q0(·). [A code sketch of Ĵ and of this two-stage loop appears after the table.] |
| Open Source Code | Yes | Code for DKEF is at github.com/kevin-w-li/deep-kexpfam. |
| Open Datasets | Yes | We trained DKEF and the likelihood-based models on five UCI datasets (Dheeru & Karra Taniskidou, 2017); in particular, we used Red Wine, White Wine, Parkinson, Hep Mass, and Mini Boone. |
| Dataset Splits | Yes | Split D into D1 and D2. Optimize w, λ, z, and maybe q0 params: while Ĵ(p̂^{k_w}_{α(λ, k_w, z, D1)}, z, D2) is still improving, do: sample disjoint data subsets D_t, D_v ⊂ D1. Also, from Section 3: 'We can avoid this problem and additionally find the best values for the regularization weights λ with a form of meta-learning. We find choices for the kernel and regularization which will give us a good value of Ĵ on a validation set D_v when fit to a fresh training set D_t.' [The second sketch after the table lays out this split.] |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory, or cloud instance types) used for running experiments were provided. |
| Software Dependencies | No | No specific software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9, CPLEX 12.4) were listed. The paper mentions 'TensorFlow operations' but does not specify a version. |
| Experiment Setup | Yes | For the models above, we use layers of width 30 for experiments on synthetic data, and 100 for benchmark datasets. Larger values did not improve performance. ... DKEF. On synthetic datasets, we consider four variants of our model with one kernel component, R = 1. ... DKEF-G-15 has the kernel (7), with L = 3 layers of width W = 15. DKEF-G-50 is the same with W = 50. ... In all experiments, q0(x) = ∏_{d=1}^{D} exp(−|x_d − μ_d|^{β_d} / (2σ_d²)), with β_d > 1. On benchmark datasets, we use DKEF-G-50 and KEF-G with three kernel components, R = 3. ... Note that we trained DKEF while adding Gaussian noise with standard deviation 0.05 to the (whitened) dataset. [This q0 form appears in the first sketch after the table.] |
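
The pseudocode and experiment-setup rows both hinge on the score-matching objective Ĵ and the base density q0. As a rough illustration only, here is a minimal JAX sketch (not the authors' TensorFlow implementation) of the unnormalized log density log p̂(x) = Σ_m α_m k_w(z_m, x) + log q0(x) and of Ĵ. The plain Gaussian kernel on raw inputs, the parameter names, and the example values are simplifying assumptions; DKEF instead applies the Gaussian kernel to deep network features.

```python
import jax
import jax.numpy as jnp

def gaussian_kernel(x, z, sigma=1.0):
    # k_w(x, z) on raw inputs; DKEF applies this to deep network features phi_w(x), phi_w(z)
    return jnp.exp(-jnp.sum((x - z) ** 2) / (2.0 * sigma ** 2))

def log_q0(x, mu, log_sigma, beta):
    # base density q0(x) = prod_d exp(-|x_d - mu_d|^beta_d / (2 sigma_d^2)), up to normalization
    return jnp.sum(-jnp.abs(x - mu) ** beta / (2.0 * jnp.exp(2.0 * log_sigma)))

def log_p(x, alpha, z, mu, log_sigma, beta):
    # unnormalized log density: f(x) + log q0(x), with f(x) = sum_m alpha_m k_w(z_m, x)
    f = jnp.dot(alpha, jax.vmap(lambda zm: gaussian_kernel(x, zm))(z))
    return f + log_q0(x, mu, log_sigma, beta)

def score_matching_objective(X, alpha, z, mu, log_sigma, beta):
    # J-hat = (1/|D_v|) sum_n sum_d [ d^2/dx_d^2 log p(x_n) + 0.5 * (d/dx_d log p(x_n))^2 ]
    def per_point(x):
        grad = jax.grad(log_p)(x, alpha, z, mu, log_sigma, beta)
        diag_hess = jnp.diag(jax.hessian(log_p)(x, alpha, z, mu, log_sigma, beta))
        return jnp.sum(diag_hess + 0.5 * grad ** 2)
    return jnp.mean(jax.vmap(per_point)(X))

if __name__ == "__main__":
    # toy example: D = 2 dimensions, M = 3 inducing points, 5 data points
    k1, k2 = jax.random.split(jax.random.PRNGKey(0))
    X = jax.random.normal(k1, (5, 2))
    z = jax.random.normal(k2, (3, 2))
    alpha = jnp.full(3, 0.1)
    mu, log_sigma, beta = jnp.zeros(2), jnp.zeros(2), jnp.full(2, 1.5)
    print(score_matching_objective(X, alpha, z, mu, log_sigma, beta))
```

Here Ĵ is evaluated on the full log p̂ = f + log q0 by automatic differentiation; Algorithm 1 writes the per-point terms in terms of f.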
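
The dataset-splits row mirrors the two-stage structure of Algorithm 1: α is always fit on data disjoint from the minibatch on which Ĵ is validated. The sketch below lays out only that structure; `init_params`, `fit_alpha` (the paper uses a closed-form, ridge-regularized solution for α), `j_hat`, and `sgd_step` are hypothetical callbacks, and the batch size and step counts are illustrative rather than the authors' settings.

```python
import numpy as np

def sample_disjoint(D1, batch, rng):
    # draw two non-overlapping minibatches D_t, D_v from D1
    idx = rng.permutation(len(D1))
    return D1[idx[:batch]], D1[idx[batch:2 * batch]]

def train_dkef(D, init_params, fit_alpha, j_hat, sgd_step, n_steps=1000, batch=200, seed=0):
    rng = np.random.default_rng(seed)
    D = rng.permutation(D)                       # shuffle rows before splitting
    D1, D2 = np.split(D, [len(D) // 2])          # "Split D into D1 and D2"

    params = init_params()                       # kernel weights w, lambda, inducing points z, q0 params
    # Stage 1: learn w, lambda, z (and maybe q0 params) by SGD on J-hat,
    # refitting alpha on a fresh minibatch D_t at every step
    for _ in range(n_steps):
        Dt, Dv = sample_disjoint(D1, batch, rng)
        objective = lambda p: j_hat(Dv, fit_alpha(p, Dt), p)
        params = sgd_step(params, objective)

    # Stage 2: fit alpha on all of D1 and tune the regularization lambda only,
    # validating on minibatches of the held-out split D2
    for _ in range(n_steps):
        Dv = D2[rng.choice(len(D2), batch, replace=False)]
        objective = lambda p: j_hat(Dv, fit_alpha(p, D1), p)
        params = sgd_step(params, objective, trainable=("lambda",))

    # Finalize alpha on D1 with the learned kernel and regularization
    return fit_alpha(params, D1), params
```

In stage 1 the objective closure refits α on D_t before scoring on D_v, so each SGD step moves the kernel, inducing points, and regularization toward good held-out Ĵ, which is what the Section 3 quote describes as a form of meta-learning; stage 2 then tunes λ alone against the held-out split D2.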