Controversy in mechanistic modelling with Gaussian processes

Authors: Benn Macdonald, Catherine Higham, Dirk Husmeier

ICML 2015

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In the present article, we offer a new interpretation of the second paradigm, which highlights the underlying assumptions, approximations and limitations. In particular, we show that the second paradigm suffers from an intrinsic identifiability problem, which the first paradigm is not affected by. We complement our theoretical analysis with empirical demonstrations on simulated data, using the same model systems as in the original publications, (Wang & Barber, 2014) and (Dondelinger et al., 2013).
Researcher Affiliation | Academia | Benn Macdonald, B.MACDONALD.1@RESEARCH.GLA.AC.UK, School of Mathematics & Statistics, University of Glasgow; Catherine Higham, CATHERINE.HIGHAM@GLASGOW.AC.UK, School of Mathematics & Statistics, University of Glasgow; Dirk Husmeier, DIRK.HUSMEIER@GLASGOW.AC.UK, School of Mathematics & Statistics, University of Glasgow
Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks.
Open Source Code | Yes | Our code can be downloaded from http://tinyurl.com/otus5xq.
Open Datasets | No | As a first study, we generated noisy data from the simple ODEs of (30), with species 2 missing, using a sample size of N = 20 and an average signal-to-noise ratio of SNR = 10. ... First, N = 11 data points were generated with θ1 = 2, θ2 = 1, θ3 = 4, θ4 = 1. Next, iid Gaussian noise with an average signal-to-noise ratio SNR = 4 was added, and ten independent data sets were generated this way. ... we generated data with the same parameters, α = 0.2, β = 0.2 and ψ = 3, and same initial values, V = 1, R = 1, but making the inference problem harder by reducing the training set size to N = 20, covering the time interval [0, 10]. We emulated noisy measurements by adding iid Gaussian noise with an average signal-to-noise ratio SNR = 10, and generated ten independent data instantiations. The paper generates its own simulated data for experiments and does not provide access information to a publicly available dataset.
Dataset Splits | No | The paper mentions generating data with specific sample sizes (e.g., N = 20, N = 11, and a training set size of N = 20) but does not specify any explicit training, validation, or test dataset splits or cross-validation methodologies. The data is simulated by the authors for their experiments.
Hardware Specification | No | The paper does not provide any specific details about the hardware used for running the experiments (e.g., GPU/CPU models, memory, cloud resources).
Software Dependencies | No | The paper states: "all GPODE results were obtained with the original software from (Wang & Barber, 2014). We have also integrated the inference for the AGM model into their software...". It mentions the software used but does not provide specific version numbers for any key software components or libraries.
Experiment Setup | Yes | For all simulations, we used a squared exponential kernel, and chose a U(5, 50) prior for the length scale and a U(0.1, 1) prior for the amplitude hyperparameters, respectively, as in the paper by (Wang & Barber, 2014). We tried different prior distributions of the ODE parameters, as specified in the figure captions; note that these priors are less informative than those used in (Wang & Barber, 2014). Observational noise was added in the same way as in (Wang & Barber, 2014). We set the noise variance σ_k^2 equal to the true noise variance, and the mean µ_k equal to the sample mean.
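
The simulated-data setup quoted above (N = 11 points, θ = (2, 1, 4, 1), SNR = 4, ten data sets, squared exponential kernel with uniform hyperparameter priors) can be illustrated with a minimal Python sketch. This is not the authors' code. Several details are assumptions not stated in the quoted text: the Lotka-Volterra form of the ODEs, the time interval [0, 2], the initial values (5, 3), the definition of SNR as signal variance over noise variance per species, and the kernel parameterization. The function names are hypothetical.

```python
import numpy as np

def lotka_volterra(x, theta):
    """ASSUMED model form: prey x[0], predator x[1], parameters theta1..theta4."""
    t1, t2, t3, t4 = theta
    return np.array([x[0] * (t1 - t2 * x[1]),
                     -x[1] * (t3 - t4 * x[0])])

def rk4_solve(f, x0, t_grid, theta, substeps=100):
    """Fixed-step fourth-order Runge-Kutta; returns the state at each t_grid point."""
    xs = [np.asarray(x0, dtype=float)]
    for t_a, t_b in zip(t_grid[:-1], t_grid[1:]):
        h = (t_b - t_a) / substeps
        x = xs[-1]
        for _ in range(substeps):
            k1 = f(x, theta)
            k2 = f(x + 0.5 * h * k1, theta)
            k3 = f(x + 0.5 * h * k2, theta)
            k4 = f(x + h * k3, theta)
            x = x + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
        xs.append(x)
    return np.array(xs)

def add_noise(signal, snr, rng):
    """Add iid Gaussian noise. ASSUMPTION: SNR = var(signal) / var(noise), per species."""
    noise_sd = np.sqrt(signal.var(axis=0) / snr)
    return signal + rng.normal(size=signal.shape) * noise_sd

def squared_exponential(t, amplitude, length_scale):
    """Squared exponential kernel matrix. ASSUMED parameterization:
    k(t, t') = amplitude * exp(-(t - t')^2 / (2 * length_scale^2))."""
    diff = t[:, None] - t[None, :]
    return amplitude * np.exp(-0.5 * (diff / length_scale) ** 2)

rng = np.random.default_rng(0)

theta_true = (2.0, 1.0, 4.0, 1.0)      # values quoted in the table
t_grid = np.linspace(0.0, 2.0, 11)     # N = 11 is quoted; the interval [0, 2] is assumed
x0 = (5.0, 3.0)                        # assumed initial values

clean = rk4_solve(lotka_volterra, x0, t_grid, theta_true)

# Ten independent noisy data sets at SNR = 4, as described in the table.
datasets = [add_noise(clean, snr=4.0, rng=rng) for _ in range(10)]

# One draw of the kernel hyperparameters from the quoted uniform priors.
length_scale = rng.uniform(5.0, 50.0)
amplitude = rng.uniform(0.1, 1.0)
K = squared_exponential(t_grid, amplitude, length_scale)
```

A hand-written integrator keeps the sketch dependent on NumPy only. Any standard ODE solver could be substituted without changing the rest of the code.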