reproducibilityindex.ai

Context-Aware Drift Detection

Authors: Oliver Cobb, Arnaud Van Looveren

ICML 2022

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We then provide an empirical study demonstrating its effectiveness for various drift detection problems of practical interest, such as detecting drift in the distributions underlying subpopulations of data in a manner that is insensitive to their respective prevalences. The study additionally demonstrates applicability to ImageNet-scale vision problems.
Researcher Affiliation	Industry	1Seldon Technologies. Correspondence to: Oliver Cobb <oc@seldon.io>.
Pseudocode	Yes	Algorithm 1 Context-Aware Drift Detection
Open Source Code	Yes	Make an implementation available to use as part of the open-source Python library alibi-detect (Van Looveren et al., 2022).
Open Datasets	Yes	We use the ImageNet (Deng et al., 2009) class structure developed by Santurkar et al. (2021) to represent realistic drifts in distributions underlying subpopulations... We use the ImageNet training split to train a model M to predict superclasses from the undrifted samples. We then use the ImageNet validation split for experiments.
Dataset Splits	Yes	We use the ImageNet training split to train a model M to predict superclasses from the undrifted samples. We then use the ImageNet validation split for experiments, assigning images x model predictions M(x) ∈ [0, 1]^6 as context c.
Hardware Specification	No	The paper does not provide specific hardware details (e.g., GPU model, CPU type, memory) used for running the experiments.
Software Dependencies	No	The paper mentions the "open-source Python library alibi-detect" but does not specify its version or the versions of other critical software dependencies (e.g., Python version, specific machine learning frameworks).
Experiment Setup	Yes	For kernels we use Gaussian RBFs with bandwidths set using the median heuristic (Gretton et al., 2012a). For MMD-ADiTT we fit ê(c) using kernel logistic regression. The portion of samples we hold out for this purpose is 25% across all experiments. CoDiTE estimates require a regularisation parameter λ to use as part of the estimation process, for which we use λ = 0.001 across all experiments. To associate resulting test statistics with estimates of p-values we use the conditional permutation test of Rosenbaum (1984). We do so using n_perm = 100 conditional permutations.
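
The Experiment Setup row names two generic building blocks: a Gaussian RBF kernel whose bandwidth comes from the median heuristic, and a p-value estimated from a fixed number of permutations. The sketch below illustrates both in plain Python. It is not the paper's or alibi-detect's implementation. The function names are hypothetical, the bandwidth convention (median pairwise Euclidean distance, kernel exp(-d^2 / (2σ^2))) is one common variant and an assumption here, and the +1 correction in the p-value is likewise an assumed convention. The conditional resampling step of Rosenbaum's test, which depends on ê(c), is not shown.

```python
import math
from itertools import combinations
from statistics import median


def median_heuristic_bandwidth(points):
    """Return the median pairwise Euclidean distance between the points.

    Assumption: this is one common form of the median heuristic; other
    variants use squared distances or rescale the result.
    """
    distances = [math.dist(a, b) for a, b in combinations(points, 2)]
    return median(distances)


def gaussian_rbf(a, b, bandwidth):
    """Gaussian RBF kernel k(a, b) = exp(-||a - b||^2 / (2 * bandwidth^2))."""
    return math.exp(-math.dist(a, b) ** 2 / (2.0 * bandwidth ** 2))


def permutation_p_value(observed, permuted_stats):
    """Estimate a p-value from test statistics computed under permutation.

    Counts the permuted statistics that are at least as large as the
    observed one. The +1 terms treat the observed statistic as one of the
    permutations (an assumed convention, not taken from the paper).
    """
    exceed = sum(1 for s in permuted_stats if s >= observed)
    return (1 + exceed) / (1 + len(permuted_stats))


# Toy data: three 2-D points with pairwise distances 5, 5 and 10.
points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
sigma = median_heuristic_bandwidth(points)
k01 = gaussian_rbf(points[0], points[1], sigma)

# 100 hypothetical permuted statistics, mirroring n_perm = 100.
p_val = permutation_p_value(2.0, [1.0] * 99 + [3.0])
```

With n_perm = 100 the smallest p-value this estimator can return is 1/101, so the permutation count sets the resolution of the test.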