Context-Aware Drift Detection

Authors: Oliver Cobb, Arnaud Van Looveren

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We then provide an empirical study demonstrating its effectiveness for various drift detection problems of practical interest, such as detecting drift in the distributions underlying subpopulations of data in a manner that is insensitive to their respective prevalences. The study additionally demonstrates applicability to ImageNet-scale vision problems.
Researcher Affiliation | Industry | Seldon Technologies. Correspondence to: Oliver Cobb <oc@seldon.io>.
Pseudocode | Yes | Algorithm 1 Context-Aware Drift Detection
Open Source Code | Yes | Make an implementation available to use as part of the open-source Python library alibi-detect (Van Looveren et al., 2022).
Open Datasets | Yes | We use the ImageNet (Deng et al., 2009) class structure developed by Santurkar et al. (2021) to represent realistic drifts in distributions underlying subpopulations... We use the ImageNet training split to train a model M to predict superclasses from the undrifted samples. We then use the ImageNet validation split for experiments.
Dataset Splits | Yes | We use the ImageNet training split to train a model M to predict superclasses from the undrifted samples. We then use the ImageNet validation split for experiments, assigning images x model predictions M(x) ∈ [0, 1]^6 as context c.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU model, CPU type, memory) used for running the experiments.
Software Dependencies | No | The paper mentions the "open-source Python library alibi-detect" but does not specify its version or the versions of other critical software dependencies (e.g., Python version, specific machine learning frameworks).
Experiment Setup | Yes | For kernels we use Gaussian RBFs with bandwidths set using the median heuristic (Gretton et al., 2012a). For MMD-ADiTT we fit ê(c) using kernel logistic regression. The portion of samples we hold out for this purpose is 25% across all experiments. CoDiTE estimates require a regularisation parameter λ to use as part of the estimation process, for which we use λ = 0.001 across all experiments. To associate resulting test statistics with estimates of p-values we use the conditional permutation test of Rosenbaum (1984). We do so using n_perm = 100 conditional permutations.
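
The Experiment Setup entry mentions Gaussian RBF kernels with bandwidths set by the median heuristic. The sketch below illustrates one common variant of that heuristic, in which the bandwidth is the median pairwise Euclidean distance between samples. It is an assumption made for illustration and not necessarily the exact convention used in the paper or in alibi-detect. The function names and the toy data are hypothetical.

```python
import math
from itertools import combinations
from statistics import median


def median_heuristic_bandwidth(points):
    """Return the median pairwise Euclidean distance between distinct points.

    Assumption: this is one common form of the median heuristic; other
    variants use the median of squared distances instead.
    """
    distances = [math.dist(a, b) for a, b in combinations(points, 2)]
    return median(distances)


def gaussian_rbf(a, b, bandwidth):
    """Gaussian RBF kernel k(a, b) = exp(-||a - b||^2 / (2 * bandwidth^2))."""
    return math.exp(-math.dist(a, b) ** 2 / (2.0 * bandwidth ** 2))


# Toy data (hypothetical, not taken from the paper).
points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]

# Pairwise distances are 5, 10 and 5, so the median is 5.
sigma = median_heuristic_bandwidth(points)

# Kernel value between the first two points under that bandwidth.
k_01 = gaussian_rbf(points[0], points[1], sigma)
```

Tying the bandwidth to the scale of the data in this way avoids a manual tuning step, which is presumably why a single rule could be applied across all experiments.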