A nonparametric method for gradual change problems with statistical guarantees
Authors: Lizhen Nie, Dan Nicolae
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 5 Simulations: To better understand the finite sample properties of the proposed method, we evaluate its performance in simulations and against baselines. Additional details and results, including type I error (p-value calibration), power comparison, and performance comparison on strings, are included in the Appendix. [...] 6 Real Data Applications: Unlike most machine learning tasks, there are currently no benchmark datasets with human annotations for gradual CPD. Thus, we consider the applications introduced in Section 1 and compare our results with known external events and/or other CPD estimators. |
| Researcher Affiliation | Academia | Lizhen Nie Department of Statistics The University of Chicago lizhen@statistics.uchicago.edu Dan Nicolae Department of Statistics The University of Chicago nicolae@statistics.uchicago.edu |
| Pseudocode | Yes | All steps of the proposed procedure are summarized in Algorithm 1 in Section A of the Appendix. |
| Open Source Code | Yes | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] |
| Open Datasets | Yes | "The Central England Temperature (CET) record (Parker et al., 1992) under Open Government License is the oldest temperature record worldwide and is a valuable source for studying climate change." and "The S&P 500 is a stock market index which tracks the stock of 500 large US companies and is usually used as a benchmark of the overall market. We investigate the daily return data of the S&P 500 index in two periods..." (Footnote: S&P Dow Jones Indices LLC, S&P 500 [SP500], retrieved from https://finance.yahoo.com/quote/%5EGSPC/history/.) |
| Dataset Splits | No | For one-side we tune the bandwidth on 20 independently generated datasets among {0.01, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1, 5}. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments, such as GPU/CPU models or memory specifications. |
| Software Dependencies | No | The paper mentions various methods and models but does not provide specific version numbers for any software dependencies or libraries used in the experiments. |
| Experiment Setup | Yes | Detailed setup. Setting I (main experiment): We set T = 600. For one-side we tune the bandwidth on 20 independently generated datasets among {0.01, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1, 5}. For each dataset, for fairness we use the same kernel for [...] and KCpA, and use its corresponding distance for Q, Zw and function class F for gen. For the location model, F = {f : x ↦ x_i, i = 1, ..., d}; for the network model, F = {f : x ↦ x_{ij}, i, j = 1, ..., 10}; for the volatility model, F = {f : x ↦ x²}. For poly we set the polynomial degree to the true degree if the polynomial model is correct, and to 1 otherwise. As recommended by their authors, we use a granularity of 20 for mix and a minimum spanning tree to construct the binary graph for Zw. The threshold for gen is computed using the strategy described in Section 6 of Vogt and Dette (2015). |
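The tuning procedure quoted above (picking one bandwidth from a fixed grid by its average performance over 20 independently generated datasets) can be sketched as follows. This is a minimal illustration, not the authors' code: `simulate_dataset` and `detection_score` are hypothetical placeholders for the paper's data-generating process and tuning criterion; only the grid values, the count of 20 tuning datasets, and T = 600 come from the quoted setup.

```python
import random

# Grid and counts quoted from the paper's Setting I.
BANDWIDTH_GRID = [0.01, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5,
                  0.6, 0.7, 0.8, 0.9, 1, 5]
N_TUNING_DATASETS = 20


def simulate_dataset(seed, T=600):
    """Hypothetical stand-in for the paper's data-generating process."""
    rng = random.Random(seed)
    return [rng.gauss(0, 1) for _ in range(T)]


def detection_score(dataset, bandwidth):
    """Hypothetical stand-in for the tuning criterion (lower is better),
    e.g. localization error on data with a known change point.
    Placeholder: pretends bandwidth 0.3 is optimal."""
    return abs(bandwidth - 0.3)


def tune_bandwidth():
    # Generate the 20 independent tuning datasets once, then score
    # every candidate bandwidth on all of them.
    datasets = [simulate_dataset(seed) for seed in range(N_TUNING_DATASETS)]
    return min(
        BANDWIDTH_GRID,
        key=lambda h: sum(detection_score(d, h) for d in datasets)
                      / len(datasets),
    )


if __name__ == "__main__":
    print(tune_bandwidth())
```

The structure (fixed grid, average score over independent replicates, argmin) matches the quoted description; any real reproduction would substitute the paper's actual simulation models and evaluation metric.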