Detecting Abrupt Changes in Sequential Pairwise Comparison Data

Authors: Wanshan Li, Alessandro Rinaldo, Daren Wang

NeurIPS 2022

Reproducibility Variable Result LLM Response
Research Type Experimental In this section, we study the numerical performance of our newly proposed method, which combines dynamic programming with local refinement and which we refer to as DPLR; see Algorithms 1 and 2. The detection of multiple change points in pairwise comparison data has not been studied before: Höhle (2010) only considers single change point detection in this setting, so we are not aware of any existing competing methods in the literature. We therefore develop a potential competitor that combines Wild Binary Segmentation (WBS) (Fryzlewicz, 2014), a popular method for univariate change point detection, with the likelihood ratio approach studied in Höhle (2010). We call this competitor WBS-GLR (GLR stands for generalized likelihood ratio). Due to space constraints, we give the details of WBS-GLR in Appendix A.1 and report additional experiments, covering further settings, in Appendix A.2. We also discuss and compare the performance of two other potential competitors in Appendix A.4.
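The penalized dynamic program underlying Algorithm 1 can be sketched generically. This is a minimal illustration, not the paper's implementation: `dp_change_points` is a hypothetical name, and the squared-error segment cost in the demo stands in for the negative BTL log-likelihood that DPLR actually uses.

```python
import numpy as np

def dp_change_points(costs, gamma):
    """Penalized dynamic program for multiple change point detection.

    costs[(s, e)] is the goodness-of-fit loss from fitting one model to
    the half-open segment [s, e); gamma penalizes each extra segment.
    Returns the interior change points of the optimal segmentation.
    """
    T = max(e for (_, e) in costs)
    best = np.full(T + 1, np.inf)  # best[e] = optimal penalized cost of data up to e
    best[0] = -gamma               # offset so the first segment is not penalized
    back = np.zeros(T + 1, dtype=int)
    for e in range(1, T + 1):
        for s in range(e):
            cand = best[s] + gamma + costs[(s, e)]
            if cand < best[e]:
                best[e], back[e] = cand, s
    # trace the optimal segmentation back from T
    cps, e = [], T
    while e > 0:
        s = back[e]
        if s > 0:
            cps.append(s)
        e = s
    return sorted(cps)

# toy demo: piecewise-constant signal with one jump at t = 20,
# squared-error segment cost as a stand-in for the BTL loss
y = np.r_[np.zeros(20), 5 * np.ones(20)]
costs = {(s, e): float(np.sum((y[s:e] - y[s:e].mean()) ** 2))
         for s in range(len(y)) for e in range(s + 1, len(y) + 1)}
print(dp_change_points(costs, gamma=1.0))  # [20]
```

The O(T^2) segment loop mirrors the standard penalized DP recursion; a very large gamma suppresses all change points, a small one over-segments.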
Researcher Affiliation Academia Wanshan Li, Department of Statistics & Data Science, Carnegie Mellon University, wanshanl@andrew.cmu.edu; Daren Wang, Department of ACMS, University of Notre Dame, dwang24@nd.edu; Alessandro Rinaldo, Department of Statistics & Data Science, Carnegie Mellon University, arinaldo@cmu.edu
Pseudocode Yes Algorithm 1: Dynamic Programming. DP({(x(t), y_t)}_{t ∈ [T]}, γ) ... Algorithm 2: Local Refinement. INPUT: Data {(x(t), y_t)}_{t ∈ [T]}, preliminary estimates {η̃_k}_{k ∈ [K̃]}, and (η̃_0, η̃_{K̃+1}) ← (1, T).
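The local refinement step (Algorithm 2) can be sketched, in simplified form, as a one-pass re-estimation of each preliminary change point inside the interval spanned by its two neighboring estimates. This is a sketch under assumptions: the paper's algorithm uses the BTL likelihood as the segment loss and restricts the search region, whereas `local_refinement` (a hypothetical name) searches the full neighboring interval with a generic cost.

```python
def local_refinement(costs, cps, T):
    """Single pass of local refinement (cf. Algorithm 2, simplified):
    each change point is re-estimated by minimizing the two-segment
    loss over the interval between its neighboring estimates.
    costs[(s, e)] is the fit loss on the half-open segment [s, e)."""
    bounds = [0] + sorted(cps) + [T]
    refined = []
    for k in range(1, len(bounds) - 1):
        lo, hi = bounds[k - 1], bounds[k + 1]
        # choose the split point with the smallest combined two-segment loss
        refined.append(min(range(lo + 1, hi),
                           key=lambda t: costs[(lo, t)] + costs[(t, hi)]))
    return refined

# toy demo: one jump at t = 20, squared-error cost, imprecise initial guess
y = [0.0] * 20 + [5.0] * 20
def sq_loss(seg):
    m = sum(seg) / len(seg)
    return sum((v - m) ** 2 for v in seg)
costs = {(s, e): sq_loss(y[s:e]) for s in range(40) for e in range(s + 1, 41)}
print(local_refinement(costs, [18], 40))  # [20]
```

Starting from a roughly correct preliminary estimate, the refined point snaps to the true jump because the two-segment loss is minimized exactly at the change.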
Open Source Code Yes All of our reproducible code is openly accessible. Code repository: https://github.com/MountLee/CPD_BT
Open Datasets Yes We study the game records of the National Basketball Association (NBA). A regular NBA season usually begins in October and ends in April of the following year, so in what follows a season is named by the two years it spans. The original data contains all NBA game records from the 1946-1947 season through the 2015-2016 season. We focus on a subset of 24 teams founded before 1990 and on the seasons from 1980-1981 to 2015-2016. All analysis code is available online with the data. Data source: https://gist.github.com/masterofpun/2508ab845d53add72d2baf6a0163d968
Dataset Splits Yes For both methods, we use cross-validation to choose the tuning parameter γ. Given the sequential pairwise comparison data in each trial, we use the samples with odd time indices as training data and those with even time indices as test data. For each tuning parameter, the method is applied to the training data to obtain change point estimates. A BTL model is then fitted to the test data on each interval determined by the estimated change points. The tuning parameter and the corresponding change point estimators with the minimal test error (negative log-likelihood) are selected.
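The tuning scheme described above, odd time indices for training, even for testing, negative log-likelihood as the test error, can be sketched generically. `choose_gamma`, `detect`, and `test_loss` are hypothetical placeholders standing in for the paper's DPLR components, not its actual API.

```python
def choose_gamma(data, gammas, detect, test_loss):
    """Cross-validation over the penalty gamma as described above.

    data is ordered by time (index 0 is time 1, so even slots hold odd
    time indices). detect(train, gamma) returns change point estimates;
    test_loss(test, cps) returns the negative log-likelihood of a
    per-segment fit on the test half. Both are caller-supplied.
    """
    train = data[::2]   # time indices 1, 3, 5, ... (training half)
    test = data[1::2]   # time indices 2, 4, 6, ... (test half)
    best = min(gammas, key=lambda g: test_loss(test, detect(train, g)))
    return best, detect(train, best)
```

A usage sketch with dummy components: a `detect` that only finds a change for small gamma, and a `test_loss` that rewards finding one, makes the selector return the small gamma.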
Hardware Specification Yes Each experiment is run on a Google Colab virtual machine with a 2-core Intel(R) Xeon(R) CPU at 2.30 GHz and 12 GB of RAM.
Software Dependencies No The paper mentions 'For the constrained MLE in Equation (3.1), we use the function in sklearn for fitting the ℓ2-penalized logistic regression,' but does not provide specific version numbers for sklearn or other software dependencies.
Experiment Setup Yes For the constrained MLE in Equation (3.1), we use the function in sklearn for fitting the ℓ2-penalized logistic regression, since the constrained and the penalized estimators for generalized linear models are well known to be equivalent. For both DPLR and WBS-GLR, we use λ = 0.1. For M, the number of random intervals in WBS-GLR, we set M = 50 as a balance between running time and accuracy.
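The reduction of the BTL fit to ℓ2-penalized logistic regression mentioned above can be illustrated with sklearn. `fit_btl_segment` is a hypothetical helper, and the mapping between the penalty level λ and sklearn's inverse-strength parameter C is stated only loosely, since the exact correspondence depends on sklearn's loss scaling.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_btl_segment(pairs, outcomes, n_teams, lam=0.1):
    """Fit BTL latent scores on one segment via l2-penalized logistic
    regression.

    pairs: list of (i, j) match-ups; outcomes[t] = 1 if i beat j.
    Each comparison is encoded as the design row e_i - e_j, so the
    logistic coefficients are the (penalized) BTL latent scores.
    """
    X = np.zeros((len(pairs), n_teams))
    for t, (i, j) in enumerate(pairs):
        X[t, i], X[t, j] = 1.0, -1.0
    # sklearn parameterizes the l2 penalty through C (inverse strength);
    # the exact correspondence with lambda depends on the loss scaling
    clf = LogisticRegression(penalty="l2", C=1.0 / lam, fit_intercept=False)
    clf.fit(X, outcomes)
    return clf.coef_.ravel()  # one estimated latent score per team

# toy example: team 0 beats everyone, team 2 loses to everyone
pairs = [(0, 1), (0, 2), (1, 2), (1, 0), (2, 0), (2, 1)]
wins = [1, 1, 1, 0, 0, 0]
scores = fit_btl_segment(pairs, wins, n_teams=3)
```

The ℓ2 penalty both matches the constrained MLE of Equation (3.1) up to the λ↔C reparameterization and keeps the estimate finite even when the comparison data are linearly separable; the recovered scores order the teams by strength.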