Improving Screening Processes via Calibrated Subset Selection

Authors: Lequn Wang, Thorsten Joachims, Manuel Gomez Rodriguez

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on US Census survey data validate our theoretical results and show that the shortlists provided by our algorithm are superior to those provided by several competitive baselines.
Researcher Affiliation | Academia | (1) Department of Computer Science, Cornell University. (2) Most of the work was done during Wang's internship at the Max Planck Institute for Software Systems. (3) Max Planck Institute for Software Systems.
Pseudocode | Yes | Algorithm 1 Calibrated Subset Selection (CSS)
Open Source Code | Yes | Our code is accessible at https://github.com/LequnWang/Improve-Screening-via-Calibrated-Subset-Selection.
Open Datasets | Yes | We create a simulated screening process using a dataset comprised of employment information for 3.2 million individuals from the US Census (Ding et al., 2021).
Dataset Splits | No | The paper mentions using a 'training set', 'calibration sets', and a 'test pool of candidates'. However, it does not explicitly describe a separate 'validation' split (e.g., for hyperparameter tuning) with specific percentages or counts.
Hardware Specification | No | The paper does not provide any specific details about the hardware used for running the experiments (e.g., GPU models, CPU types, or cloud instance specifications).
Software Dependencies | No | The paper mentions training a 'logistic regression' classifier but does not name any software with version numbers (e.g., Python, scikit-learn, PyTorch, or TensorFlow versions).
Experiment Setup | Yes | In each simulated screening process, we set the size of the test pool of candidates to m = 100, the desired expected number of qualified candidates to k = 5, and the success probability to 1 - α = 0.9. For the diversity experiments, we set the desired expected number of qualified candidates k_maj and k_min so that the equal opportunity constraint... is satisfied subject to k_maj + k_min = 5.
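To make the experiment setup concrete, the following is a minimal sketch of how the three reported parameters (m = 100, k = 5, 1 - α = 0.9) could enter a threshold-based shortlisting rule. It is not the paper's Algorithm 1: the slack term, the function names `pick_threshold` and `synthetic_candidates`, and the synthetic data (uniform scores, with qualification probability equal to the score) are all assumptions made for illustration, standing in for the Census data and the authors' calibration procedure.

```python
import math
import random

M = 100      # size of the test pool of candidates (m in the setup)
K = 5        # desired expected number of qualified candidates (k)
ALPHA = 0.1  # the success probability is 1 - ALPHA = 0.9


def pick_threshold(calibration, m=M, k=K, alpha=ALPHA):
    """Pick a score threshold from labeled calibration data.

    `calibration` is a list of (score, qualified) pairs. Scanning scores
    from high to low, return the first score at which a conservative
    estimate of the expected number of qualified shortlisted candidates
    in a pool of size m reaches k. The slack below is a DKW-style term
    chosen for illustration (an assumption, not the paper's bound).
    """
    n = len(calibration)
    slack = math.sqrt(math.log(2.0 / alpha) / (2.0 * n))
    ordered = sorted(calibration, key=lambda pair: pair[0], reverse=True)
    qualified_above = 0
    for score, qualified in ordered:
        qualified_above += int(qualified)
        # Empirical fraction of qualified candidates at or above `score`,
        # shrunk by the slack, then scaled to a pool of m candidates.
        if m * (qualified_above / n - slack) >= k:
            return score
    return float("-inf")  # no threshold suffices, so shortlist everyone


def synthetic_candidates(n, rng):
    """Hypothetical data: score ~ U(0, 1), qualified with probability = score."""
    candidates = []
    for _ in range(n):
        score = rng.random()
        candidates.append((score, rng.random() < score))
    return candidates


rng = random.Random(0)
calibration_set = synthetic_candidates(10_000, rng)
threshold = pick_threshold(calibration_set)

# Apply the threshold to one simulated test pool of m candidates.
pool = synthetic_candidates(M, rng)
shortlist = [candidate for candidate in pool if candidate[0] >= threshold]
print(f"threshold={threshold:.3f}, shortlist size={len(shortlist)}")
```

A smaller α or a smaller calibration set increases the slack, which lowers the threshold and lengthens the shortlist; this is the trade-off that the success probability in the setup controls.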