Robust Classification Under Sample Selection Bias

Authors: Anqi Liu, Brian Ziebart

NeurIPS 2014

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We apply our robust classification approach on synthetic and UCI binary classification datasets [6] to compare its performance against sample reweighted approaches for learning under sample selection bias. We empirically compare the predictive performance of the three approaches. We consider four classification datasets, selected from the UCI repository [6] based on the criteria that each contains roughly 1,000 or more examples, has discretely-valued inputs, and has minimal missing values. We generate biased subsets of these classification datasets to use as source samples and unbiased subsets to use as target samples.
Researcher Affiliation | Academia | Anqi Liu, Department of Computer Science, University of Illinois at Chicago, Chicago, IL 60607, aliu33@uic.edu; Brian D. Ziebart, Department of Computer Science, University of Illinois at Chicago, Chicago, IL 60607, bziebart@uic.edu
Pseudocode | Yes | Algorithm 1 Batch gradient for robust bias-aware classifier learning.
Open Source Code | No | The paper does not provide open-source code for the methodology described. It mentions using the CVX package, which is a third-party tool, but not the authors' own implementation code.
Open Datasets | Yes | We consider four classification datasets, selected from the UCI repository [6] ... [6] Kevin Bache and Moshe Lichman. UCI machine learning repository, 2013.
Dataset Splits | No | The paper describes generating 'source samples' for training and 'target samples' for evaluation, and averages results over 50 random samples. However, it does not specify explicit train/validation/test splits with percentages or counts, nor does it mention a dedicated validation set for hyperparameter tuning.
Hardware Specification | No | The paper does not mention specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | Yes | We employ the CVX package [25] to estimate parameters of the first two approaches ... [25] Michael Grant and Stephen Boyd. CVX: Matlab software for disciplined convex programming, version 2.1. http://cvxr.com/cvx, March 2014.
Experiment Setup | No | While the paper discusses regularization and mentions trying a "range of ℓ2-regularization weights (Appendix C)", it does not provide specific hyperparameter values (e.g., chosen regularization weights, learning rate, convergence threshold) for the final experiments in the main text.
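
The source/target construction quoted in the Research Type entry (biased subsets as source samples, unbiased subsets as target samples) can be illustrated with a small sketch. This summary does not state the paper's actual biasing scheme, so the logistic selection rule, the function names, and the bias_weight parameter below are all hypothetical and serve only to show the general idea.

```python
import math
import random


def split_biased_source_unbiased_target(rows, n_source, n_target, bias_weight, seed=0):
    """Draw a biased source sample and an unbiased target sample from rows.

    Assumption (not taken from the paper): a row is accepted into the source
    sample with probability sigmoid(bias_weight * first_feature), so rows with
    a larger first feature are over-represented in the source sample.
    Each row is a tuple (first_feature, label).
    """
    rng = random.Random(seed)

    def selection_prob(row):
        # Hypothetical selection rule; assumes features are roughly standardized.
        return 1.0 / (1.0 + math.exp(-bias_weight * row[0]))

    # Unbiased target sample: uniform sampling without replacement.
    target = rng.sample(rows, n_target)

    # Biased source sample: rejection sampling (with replacement) under the
    # hypothetical selection probability.
    source = []
    while len(source) < n_source:
        row = rng.choice(rows)
        if rng.random() < selection_prob(row):
            source.append(row)
    return source, target


def mean_first_feature(sample):
    """Average of the first feature, used here to make the bias visible."""
    return sum(row[0] for row in sample) / len(sample)
```

Calling the function repeatedly with different seed values and averaging an evaluation metric over the runs would mirror the averaging over 50 random samples that the Dataset Splits entry mentions, although the metric and the classifier are outside the scope of this sketch.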