Adversarial Learning for Feature Shift Detection and Correction

Authors: Míriam Barrabés, Daniel Mas Montserrat, Margarita Geleta, Xavier Giró-i-Nieto, Alexander Ioannidis

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We provide an in-depth experimental evaluation with multiple manipulation types and datasets. (Introduction, Contributions; Section 6, Experimental Results)
Researcher Affiliation | Collaboration | Míriam Barrabés (1,2), Daniel Mas Montserrat (1), Margarita Geleta (3), Xavier Giró-i-Nieto (4), Alexander G. Ioannidis (1). Affiliations: 1 Stanford University; 2 Universitat Politècnica de Catalunya; 3 University of California, Berkeley; 4 Amazon
Pseudocode | Yes | Algorithm 1: DF-Locate (page 4); Algorithm 2: DF-Correct (page 5)
Open Source Code | Yes | The code is available at https://github.com/AI-sandbox/DataFix. (Abstract)
Open Datasets | Yes | We use multiple datasets including UCI datasets such as Gas [76], Energy [77], and Musk2 [78], OpenML datasets including Scene [79], MNIST [80], and Dilbert [81], datasets with DNA sequences such as Founders [82] and a private Dog DNA dataset (Canine), a subset of phenotypes from the UK Biobank (Phenotypes) [83], COVID-19 data [84], and simple datasets including values generated from Cosine and Polynomial functions. (Section 6, Real-world datasets)
Dataset Splits | Yes | We use 5-fold train-evaluation such that 80% of the samples are used to train a random forest binary classifier Dθ(x), and the remaining 20% is used to estimate the empirical divergence with Nx and Ny testing samples of the reference and query datasets, respectively. (Section 4, Shift Detection). We use the simulated datasets to perform hyperparameter search for DataFix and all competing methods, while the real datasets are used as a hold-out testing set. (Section 6, Experimental Details). A sketch of this classifier-based divergence estimate follows the table.
Hardware Specification | Yes | All experiments were done with an Intel Xeon Gold with 12 CPU cores. (Section H, Computational Time)
Software Dependencies | No | The paper mentions software components such as random forests, gradient boosting trees, and CatBoost [75], but does not provide specific version numbers for these or other dependencies required for reproduction.
Experiment Setup | Yes | We use 5-fold train-evaluation such that 80% of the samples are used to train a random forest binary classifier Dθ(x)... (Section 4, Shift Detection). τ is a hyperparameter set by hyperparameter optimization... We use ϵ = 0.02 as the stopping threshold. (Section 4, DF-Locate). If the initial empirical divergence is already lower than ε = 0.1, the correction process is finalized. (Section 5, Initial Imputation). Typically, the number of epochs is set to 1 or 2. (Section 5, Iterative Process). Table 4: Search space and optimal values for tuned parameters in detection benchmarking methods. Table 5: Search space and optimal values for tuned parameters in correction benchmarking methods. (Section G.1, Hyperparameter Search). A sketch of the DF-Locate stopping loop follows the table.
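
The Dataset Splits row describes a classifier two-sample test: a random forest discriminator is trained to tell reference from query samples, and its held-out performance yields an empirical divergence. Below is a minimal sketch of that procedure, assuming the standard total-variation-style proxy 2·accuracy − 1; the function name empirical_divergence and the use of plain (unbalanced) accuracy are illustrative assumptions, not the paper's exact estimator.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold

def empirical_divergence(X_ref, X_query, n_splits=5, seed=0):
    # Label reference samples 0 and query samples 1, then run 5-fold
    # train-evaluation: 80% trains the discriminator D_theta(x), the
    # held-out 20% measures how separable the two datasets are.
    X = np.vstack([X_ref, X_query])
    y = np.concatenate([np.zeros(len(X_ref)), np.ones(len(X_query))])
    folds = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    scores = []
    for train_idx, test_idx in folds.split(X, y):
        clf = RandomForestClassifier(random_state=seed)
        clf.fit(X[train_idx], y[train_idx])
        acc = clf.score(X[test_idx], y[test_idx])
        # Chance accuracy (0.5) maps to zero divergence; perfect
        # separation of reference and query maps to one.
        scores.append(max(0.0, 2.0 * acc - 1.0))
    return float(np.mean(scores))

With n_splits=5 this reproduces the 80%/20% train-evaluation split quoted above; shifted datasets yield scores near one, identically distributed ones near zero.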
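
The Experiment Setup row quotes the ϵ = 0.02 stopping threshold for DF-Locate (Algorithm 1). The loop below is a hedged sketch of that iteration, reusing empirical_divergence from the previous sketch, under the assumption that the discriminator's feature importances flag shift-driving features; top_k and the greedy removal rule are illustrative stand-ins for the paper's τ-based selection heuristic.

def df_locate(X_ref, X_query, eps=0.02, top_k=1, max_iter=50):
    # Iteratively flag the features that most help the discriminator
    # separate reference from query, until the empirical divergence on
    # the remaining features drops below the stopping threshold eps.
    remaining = list(range(X_ref.shape[1]))  # candidate feature indices
    shifted = []                             # features flagged as corrupted
    for _ in range(max_iter):
        if not remaining:
            break
        if empirical_divergence(X_ref[:, remaining], X_query[:, remaining]) < eps:
            break
        X = np.vstack([X_ref[:, remaining], X_query[:, remaining]])
        y = np.concatenate([np.zeros(len(X_ref)), np.ones(len(X_query))])
        clf = RandomForestClassifier(random_state=0).fit(X, y)
        # Remove the top_k most important features and record them.
        order = np.argsort(clf.feature_importances_)[::-1][:top_k]
        for j in sorted(order.tolist(), reverse=True):
            shifted.append(remaining.pop(j))
    return sorted(shifted)

DF-Correct (Algorithm 2) would then impute the flagged features in the query set; the quoted ε = 0.1 check simply skips correction when the initial empirical divergence is already low.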