Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Change Surfaces for Expressive Multidimensional Changepoints and Counterfactual Prediction

Authors: William Herlands, Daniel B. Neill, Hannes Nickisch, Andrew Gordon Wilson

JMLR 2019

Reproducibility Variable | Result | LLM Response

Research Type | Experimental | "Using two large spatio-temporal datasets we employ GPCS to discover and characterize complex changes that can provide scientific and policy relevant insights. Specifically, we analyze twentieth century measles incidence across the United States and discover previously unknown heterogeneous changes after the introduction of the measles vaccine. Additionally, we apply the model to requests for lead testing kits in New York City, discovering distinct spatial and demographic patterns." Section 4 is titled "Experiments".

Researcher Affiliation | Collaboration | William Herlands, EMAIL ..., Carnegie Mellon University; Daniel B. Neill, EMAIL ..., New York University; Hannes Nickisch, EMAIL ..., Digital Imaging, Philips Research Hamburg; Andrew Gordon Wilson, EMAIL ..., Cornell University.

Pseudocode | Yes | Algorithm 1 ("Initialize RKS w(x) by optimizing a simplified model with RBF kernels") and Algorithm 2 ("Initialize spectral mixture kernels") are presented on pages 17 and 18, respectively.

Open Source Code | No | The paper does not contain an explicit statement about releasing source code for the described methodology, nor does it provide a link to a code repository.

Open Datasets | Yes | "We use yearly counts of accidents from Jarrett (1979)."; "All data were taken from the 2014 American Community Survey 5 year average at the zip code level (Census Bureau, 2014b)."; "Incidence rates per 100,000 population based on historical population estimates are made publicly available by Project Tycho (van Panhuis et al., 2013)."

Dataset Splits | No | "Using synthetic data, we create a predictive test by splitting the data into training and testing sets." This statement is too general and does not provide specific details on the splits (e.g., percentages, sample counts, or methodology).

Hardware Specification | No | The paper does not report specific hardware details, such as GPU or CPU models or other machine specifications, used to run its experiments.

Software Dependencies | No | The paper mentions the "KISS-GP framework (Wilson and Nickisch, 2015)" and the "Gaussian processes for machine learning (gpml) toolbox (Rasmussen and Nickisch, 2010)", but does not specify the software libraries or packages, with version numbers, that would be needed to replicate the experiments.

Experiment Setup | Yes | "Therefore, we use m1 = 100 and m2 = 20 for Algorithm 1."; "Specifically, we let Λ = (range(x)/2)^2, σ0 = std(y), and σn = mean(|y|)/10."; "For each method we average the results for 10 random restarts."
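The hyperparameter initialization quoted in the Experiment Setup row can be sketched in Python. This is a minimal illustration of the stated heuristics only (Λ = (range(x)/2)^2, σ0 = std(y), σn = mean(|y|)/10); the function name and the synthetic inputs are assumptions, not code from the paper.

```python
import numpy as np

def init_gp_hyperparameters(x, y):
    """Heuristic GP hyperparameter initialization as quoted in the report.

    Lambda  : squared half-range of the inputs, per input dimension
    sigma0  : signal standard deviation, std(y)
    sigma_n : noise standard deviation, mean(|y|) / 10
    """
    Lambda = ((x.max(axis=0) - x.min(axis=0)) / 2.0) ** 2
    sigma0 = float(np.std(y))
    sigma_n = float(np.mean(np.abs(y)) / 10.0)
    return Lambda, sigma0, sigma_n

# Synthetic 1-D example (illustrative data, not from the paper)
x = np.linspace(0.0, 10.0, 100).reshape(-1, 1)
y = np.sin(x).ravel()
Lambda, sigma0, sigma_n = init_gp_hyperparameters(x, y)
```

For the example above, the input range is 10, so the initial lengthscale parameter is (10/2)^2 = 25 for the single input dimension.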