Concept Drift Detection Through Resampling
Authors: Maayan Harel, Shie Mannor, Ran El-Yaniv, Koby Crammer
ICML 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results show that the method has high recall and precision, and performs well in the presence of noise. |
| Researcher Affiliation | Academia | Maayan Harel MAAYANGA@TX.TECHNION.AC.IL Koby Crammer KOBY@EE.TECHNION.AC.IL Ran El-Yaniv RANI@CS.TECHNION.AC.IL Shie Mannor SHIE@EE.TECHNION.AC.IL Technion Israel Institute of Technology, Haifa, Israel. |
| Pseudocode | Yes | Algorithm 1: Concept Drift Detection Scheme; Algorithm 2: Slow Gradual Drift Detection Scheme; Algorithm 3: TEST(R̂_ord, {R̂_{S_i}(A_{S_i})}_{i=1}^P, ε, δ), the procedure for the detection scheme. |
| Open Source Code | No | The paper mentions using 'scikit-learn: Machine Learning in Python toolbox' but does not provide a link or explicit statement about releasing their own implementation code. |
| Open Datasets | Yes | We compare detection performance on a user preference prediction task defined using the 20-newsgroups text dataset (http://qwone.com/~jason/20Newsgroups/), consisting of 18,846 documents and over 75,000 features. |
| Dataset Splits | Yes | In the first step of our basic detection scheme the observed sub-sequence Z_n is divided into a training window S_ord = Z_1^k, 1 < k < n, and a test window S̄_ord = Z_{k+1}^n. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU models, CPU types, or memory amounts used for running the experiments. |
| Software Dependencies | No | We used the scikit-learn Machine Learning in Python toolbox. Cross-validation on a single random concept showed low sensitivity to the choice of C on this dataset, so the default value C = 1 was chosen. The paper mentions scikit-learn but specifies no version number for it or for any other key software component. |
| Experiment Setup | Yes | We set the sensitivity level of PERM and grad-PERM to δ = 0.01, ε = 0, the warning and detection thresholds of STEPD to w = 0.05, d = 0.01, and the parameters of EDDM to α = 0.95 and β = 0.90. The base algorithm was K-Nearest Neighbors (k = 3), each stream was randomly repeated 100 times, and P = 100 reshuffling splits were used in PERM. We use P = 500, and SVM and SVR with linear kernel as the learning algorithms. Cross-validation on a single random concept showed low sensitivity to the choice of C on this dataset and therefore the default value C = 1 was chosen. |
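The detection scheme summarized in the table — split the window into an ordered train/test pair, compare the ordered test risk against risks from P random reshufflings, and flag drift at sensitivity δ — can be sketched as follows. This is a minimal illustration of the permutation-test idea, not the authors' released implementation (none exists); the function name, the empirical p-value form, and the fixed k = 3 KNN base learner (matching the table's setup) are our assumptions.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def perm_drift_test(X, y, k, P=100, delta=0.01, seed=0):
    """Sketch of a PERM-style permutation drift test.

    The stream Z_1..Z_n is split into a training window Z_1^k and a test
    window Z_{k+1}^n. The "ordered" risk on that split is compared with
    risks computed on P random reshufflings of the whole window; under no
    drift the ordered split is exchangeable with the shuffled ones, so a
    conspicuously high ordered risk signals drift.
    """
    rng = np.random.default_rng(seed)
    n = len(y)

    def risk(train_idx, test_idx):
        # Base learner: K-Nearest Neighbors with k = 3, as in the paper's setup.
        clf = KNeighborsClassifier(n_neighbors=3)
        clf.fit(X[train_idx], y[train_idx])
        return 1.0 - clf.score(X[test_idx], y[test_idx])

    idx = np.arange(n)
    r_ord = risk(idx[:k], idx[k:])           # risk on the ordered split
    perm_risks = [risk(p[:k], p[k:]) for p in
                  (rng.permutation(n) for _ in range(P))]
    # Empirical p-value: fraction of permuted risks at least as large as
    # the ordered risk (with the +1 correction for the observed split).
    p_val = (1 + sum(r >= r_ord for r in perm_risks)) / (P + 1)
    return p_val < delta, p_val
```

For example, a stream whose labeling rule flips halfway through yields a high ordered risk but near-chance permuted risks, so the test fires; on a stationary stream the ordered risk is statistically indistinguishable from the permuted ones.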