Change-point Detection for Sparse and Dense Functional Data in General Dimensions

Authors: Carlos Misael Madrid Padilla, Daren Wang, Zifeng Zhao, Yi Yu

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive numerical experiments illustrate the effectiveness of FSBS and its advantage over existing methods in the literature under various settings. A real data application is further conducted, where FSBS localises change-points of sea surface temperature patterns in the south Pacific attributed to El Niño."
Researcher Affiliation | Academia | Carlos Misael Madrid Padilla, Department of Mathematics, University of Notre Dame (cmadridp@nd.edu); Daren Wang, Department of Statistics, University of Notre Dame (dwang24@nd.edu); Zifeng Zhao, Mendoza College of Business, University of Notre Dame (zzhao2@nd.edu); Yi Yu, Department of Statistics, University of Warwick (yi.yu.2@warwick.ac.uk)
Pseudocode | Yes | Algorithm 1, Functional Seeded Binary Segmentation: FSBS((s, e], h, τ)
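FSBS searches over a deterministic collection of seeded intervals in the spirit of seeded binary segmentation. For reference only, a minimal R sketch of that interval construction follows; the function name seeded_intervals and its default decay are illustrative and are not taken from the FSBS repository.

    # Sketch of the seeded-interval construction behind seeded binary
    # segmentation; names and defaults are illustrative only.
    seeded_intervals <- function(n, decay = sqrt(2)) {
      depth <- ceiling(log(n, base = decay))
      out <- list()
      for (k in seq_len(depth)) {
        len   <- n * decay^(-(k - 1))     # interval length at layer k
        shift <- len / 2                  # neighbouring intervals overlap by half
        for (s in seq(0, n - len, by = shift)) {
          out[[length(out) + 1]] <- c(floor(s), ceiling(s + len))
        }
      }
      unique(do.call(rbind, out))         # rows are (left, right] endpoints
    }
    seeded_intervals(100)                 # e.g. intervals for n = 100 time points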
Open Source Code | Yes | "The code to replicate all of our experiments can be found at https://github.com/cmadridp/FSBS."
Open Datasets | Yes | "We consider the COBE-SST2 dataset [24], which consists of monthly average sea surface temperature (SST) from 1940 to 2019, on a 1 degree latitude by 1 degree longitude grid (48 × 30) covering Australia." Reference: [24] Physical Sciences Laboratory (2020), COBE SST2 and Sea-Ice, https://psl.noaa.gov/data/gridded/data.cobe2.html.
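The COBE-SST2 grid is distributed by NOAA PSL as a netCDF file. A hedged sketch of loading and windowing it in R follows; the file name sst.mon.mean.nc, the variable names, and the longitude/latitude bounds are assumptions based on the usual PSL layout, not details taken from the paper.

    # Hedged sketch: load the COBE-SST2 monthly SST grid with ncdf4.
    # File and variable names are assumptions about the PSL distribution;
    # download the data from the URL given in reference [24].
    library(ncdf4)
    nc  <- nc_open("sst.mon.mean.nc")
    sst <- ncvar_get(nc, "sst")           # array: lon x lat x month
    lon <- ncvar_get(nc, "lon")
    lat <- ncvar_get(nc, "lat")
    nc_close(nc)
    # An illustrative 48 x 30 window of 1-degree cells around Australia,
    # matching the grid size quoted above.
    aus <- sst[lon >= 110 & lon < 158, lat >= -40 & lat < -10, ]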
Dataset Splits | Yes | "The tuning parameter τ and the bandwidth h are chosen by cross-validation, with evenly-indexed data being the training set and oddly-indexed data being the validation set."
Hardware Specification | No | The paper explicitly states: "Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [No] The computation is sufficiently fast and computational cost is not a concern."
Software Dependencies | No | "For the implementation of FSBS, we adopt the Gaussian kernel. Following the standard practice in kernel density estimation, the bandwidth h is selected by the function Hpi in the R package ks [9]." The paper names the R package ks but does not specify its version number.
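For context, this is how the cited plug-in selector is typically invoked; only the function Hpi() and the package ks come from the paper, while the bivariate sample below is synthetic.

    # Minimal usage of the ks::Hpi plug-in bandwidth selector; the data
    # are synthetic and stand in for the evaluation points used by FSBS.
    library(ks)
    set.seed(1)
    x <- matrix(rnorm(400), ncol = 2)   # 200 points in R^2
    H <- Hpi(x)                         # 2 x 2 plug-in bandwidth matrix
    H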
Experiment Setup | Yes | "The tuning parameter τ and the bandwidth h are chosen by cross-validation, with evenly-indexed data being the training set and oddly-indexed data being the validation set. For each pair of candidate (h, τ), we obtain change-point estimators $\{\hat\eta_k\}_{k=1}^{\hat K}$ on the training set and compute the validation loss $\sum_{k=1}^{\hat K}\sum_{t\in[\hat\eta_k,\hat\eta_{k+1})}\sum_{i=1}^{n}\{(\hat\eta_{k+1}-\hat\eta_k)^{-1}\sum_{s=\hat\eta_k+1}^{\hat\eta_{k+1}}\hat F_{s,h}(x_{t,i})-y_{t,i}\}^2$. The pair (h, τ) is then chosen to be the one corresponding to the lowest validation loss."
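A hedged R sketch of this even/odd cross-validation is given below. The helpers fsbs() (the change-point estimator) and F_hat() (the kernel estimate of the mean function at time s with bandwidth h) are hypothetical placeholders for the authors' implementation.

    # Hedged sketch of the even/odd cross-validation described above;
    # fsbs() and F_hat() are hypothetical stand-ins for the authors' code.
    cv_loss <- function(y, x, h, tau) {
      train <- seq(2, nrow(y), by = 2)    # evenly-indexed data: training set
      valid <- seq(1, nrow(y), by = 2)    # oddly-indexed data: validation set
      eta <- c(0, fsbs(y[train, ], x[train, , ], h = h, tau = tau), length(valid))
      loss <- 0
      for (k in seq_len(length(eta) - 1)) {
        seg <- (eta[k] + 1):eta[k + 1]    # time points in segment k
        for (t in seg) {
          # segment-averaged estimate evaluated at the validation design points x_{t,i}
          pred <- rowMeans(sapply(seg, function(s) F_hat(x[valid[t], , ], s, h)))
          loss <- loss + sum((pred - y[valid[t], ])^2)
        }
      }
      loss
    }
    # Evaluate cv_loss() over a grid of (h, tau) and keep the minimiser.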