Measuring the Discrepancy between Conditional Distributions: Methods, Properties and Applications

Authors: Shujian Yu, Ammar Shaker, Francesco Alesiani, Jose Principe

IJCAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We present three solid examples of machine learning applications to demonstrate the performance improvement over the state-of-the-art (SOTA) methodologies gained by our conditional divergence statistic. We evaluate the performance of our method against four SOTA error-based concept drift detectors (i.e., DDM [Gama et al., 2004], EDDM [Baena-García et al., 2006], HDDM [Frías-Blanco et al., 2014], and PERM) on two real-world data streams, namely the Digits08 [Sethi and Kantardzic, 2017] and the Abrupt Insects [dos Reis et al., 2016].
Researcher Affiliation | Collaboration | ¹NEC Labs Europe, 69115 Heidelberg, Germany; ²University of Florida, Gainesville, FL 32611, USA
Pseudocode | Yes | Algorithm 1: Test the conditional distribution divergence (CDD) based on the matrix Bregman divergence
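The row above cites an algorithm built on a matrix Bregman divergence. As a hedged illustration of that building block only (not the paper's CDD estimator), the von Neumann matrix Bregman divergence D(A‖B) = tr(A log A − A log B − A + B) can be sketched for the special case of diagonal PSD matrices, where the matrix logarithm acts elementwise and the trace reduces to a sum over eigenvalues:

```python
import math

def von_neumann_divergence_diag(a, b):
    """Von Neumann matrix Bregman divergence D(A||B) = tr(A log A - A log B - A + B),
    specialized to diagonal PSD matrices A = diag(a), B = diag(b).

    For diagonal matrices the matrix logarithm is elementwise, so the trace
    becomes a sum over the (positive) diagonal entries. This is a didactic
    special case, not the paper's full statistic."""
    return sum(ai * (math.log(ai) - math.log(bi)) - ai + bi
               for ai, bi in zip(a, b))

# Identical spectra give zero divergence; differing spectra give a positive score.
print(von_neumann_divergence_diag([1.0, 2.0], [1.0, 2.0]))       # 0.0
print(von_neumann_divergence_diag([1.0, 2.0], [2.0, 1.0]) > 0.0)  # True
```

The general (non-diagonal) case replaces the elementwise logarithm with an eigendecomposition-based matrix logarithm; the divergence is asymmetric in A and B, as is typical for Bregman divergences.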
Open Source Code | Yes | Code of our statistic is available at https://bit.ly/Bregman_Correntropy.
Open Datasets | Yes | To this end, we select data from 29 tasks that are collected from various landmine fields (http://www.ee.duke.edu/~lcarin/LandmineData.zip). We replace the naïve ℓ2 distance with our proposed statistic to reconstruct the initial kNN graph for CCMTL and test its performance on a real-world Parkinson's disease data set [Tsanas et al., 2009]. We evaluate the performance of our method against four SOTA error-based concept drift detectors... on two real-world data streams, namely the Digits08 [Sethi and Kantardzic, 2017] and the Abrupt Insects [dos Reis et al., 2016]. We perform feature selection on two benchmark data sets [Brown et al., 2012].
Dataset Splits | No | No explicit training/test/validation split percentages or counts that include a separate validation set are provided. The paper mentions 'different train/test ratios' for some experiments and '10-fold cross-validation' for others, but no explicit validation split.
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments are provided.
Software Dependencies | No | No specific ancillary software details (e.g., library or solver names with version numbers) are provided. The paper mentions various methods and estimators, such as an 'adaptive kNN estimator' and a 'linear Support Vector Machine (SVM)', but without corresponding version numbers for implementation reproducibility.
Experiment Setup | Yes | We then use Algorithm 1 (P = 500, η = 0.1) to test if our statistic can distinguish these two data sets. Throughout this work, we determine kernel width σ with Silverman's rule of thumb [Silverman, 1986]. We select 10 features and use the linear Support Vector Machine (SVM) as the baseline classifier.
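The setup above names a permutation count P = 500, a significance level η = 0.1, and Silverman's rule for the kernel width σ. A minimal sketch of such a harness, assuming a generic label-permutation test with a toy mean-difference statistic standing in for the paper's CDD, and assuming the univariate Gaussian form of Silverman's rule (σ = 1.06 · std · n^(−1/5); the paper does not spell out which variant it uses):

```python
import random
import statistics

def silverman_bandwidth(x):
    """Silverman's rule of thumb, univariate Gaussian form (assumed variant):
    sigma = 1.06 * std(x) * n**(-1/5)."""
    return 1.06 * statistics.stdev(x) * len(x) ** (-0.2)

def permutation_test(statistic, sample_a, sample_b, P=500, eta=0.1, seed=0):
    """Generic two-sample permutation test mirroring the reported setup
    (P = 500 permutations, significance level eta = 0.1). `statistic` is any
    two-sample discrepancy; the paper's CDD would be plugged in here."""
    rng = random.Random(seed)
    observed = statistic(sample_a, sample_b)
    pooled = list(sample_a) + list(sample_b)
    n = len(sample_a)
    exceed = 0
    for _ in range(P):
        rng.shuffle(pooled)  # break any association between sample and label
        if statistic(pooled[:n], pooled[n:]) >= observed:
            exceed += 1
    p_value = (exceed + 1) / (P + 1)  # add-one correction for a valid p-value
    return p_value, p_value <= eta    # True -> the two samples differ

# Toy discrepancy: absolute difference of sample means (NOT the paper's CDD).
mean_gap = lambda a, b: abs(sum(a) / len(a) - sum(b) / len(b))

rng = random.Random(1)
same = [rng.gauss(0.0, 1.0) for _ in range(50)]
shifted = [v + 3.0 for v in same]
print(permutation_test(mean_gap, same, shifted)[1])  # True: a clear shift is detected
```

Note that the test with identical samples never rejects (the observed gap of zero is never exceeded strictly), while a large mean shift yields a p-value near 1/(P+1).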