QuantTree: Histograms for Change Detection in Multivariate Data Streams
Authors: Giacomo Boracchi, Diego Carrera, Cristiano Cervellera, Danilo Macciò
ICML 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments show that the proposed histograms are very effective in detecting changes in high dimensional data streams, and that the resulting thresholds can effectively control the false positive rate, even when the number of training samples is relatively small. |
| Researcher Affiliation | Academia | (1) Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Milan, Italy. (2) Institute of Intelligent Systems for Automation, National Research Council, Genova, Italy. |
| Pseudocode | Yes | Algorithm 1: QuantTree |
| Open Source Code | Yes | The implementation of QuantTree is available at http://home.deib.polimi.it/boracchi/Projects |
| Open Datasets | Yes | We also employ four real-world high-dimensional sets: MiniBooNE particle identification (particle, d = 50), Physicochemical Properties of Protein Tertiary Structure (protein, d = 9), Sensorless Drive Diagnosis (sensorless, d = 48) from the UCI Machine Learning Repository (Lichman, 2013), and Credit Card Fraud Detection (credit, d = 29) from (Dal Pozzolo et al., 2015). |
| Dataset Splits | No | The paper mentions a 'training set TR' and evaluates on 'batches W' but does not specify a distinct validation set or detailed train/validation/test splits, nor does it refer to cross-validation. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers required to replicate the experiments. |
| Experiment Setup | Yes | We consider a small TR configuration, where N = 4096 and ν = 64, and a large TR configuration, where N = 16384 and ν = 256. Both configurations have been tested with a number of bins K = 32 and K = 128, leading to 4 different combinations (N, ν, K). In all our experiments, the target FPR has been set to α = 0.05. |
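
The four (N, ν, K) combinations in the Experiment Setup row can be enumerated with a short script. This is only an illustrative sketch, not the authors' code: the names `TR_CONFIGS`, `BIN_COUNTS`, and `experiment_grid` are hypothetical, and the per-bin averages assume equal-probability bins, which the quoted text does not state.

```python
from itertools import product

# Values quoted in the Experiment Setup row.
TR_CONFIGS = {"small": (4096, 64), "large": (16384, 256)}  # name -> (N, nu)
BIN_COUNTS = (32, 128)                                     # number of bins K
ALPHA = 0.05                                               # target false positive rate


def experiment_grid():
    """Return every (name, N, nu, K) combination as a list of tuples."""
    return [
        (name, n, nu, k)
        for (name, (n, nu)), k in product(TR_CONFIGS.items(), BIN_COUNTS)
    ]


grid = experiment_grid()
for name, n, nu, k in grid:
    # Average points per bin, ASSUMING equal-probability bins (hypothetical).
    print(
        f"{name} TR: N={n}, nu={nu}, K={k}, "
        f"training points/bin={n / k:g}, batch points/bin={nu / k:g}, alpha={ALPHA}"
    )
```

Under that assumption, the small configuration with K = 128 leaves on average half a batch point per bin, which suggests why threshold calibration matters at small sample sizes.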