Learning with Labeling Induced Abstentions
Authors: Kareem Amin, Giulia DeSalvo, Afshin Rostamizadeh
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct a thorough set of experiments including an ablation study to test different components of our algorithm. We demonstrate the effectiveness of an efficient version of our algorithm over margin sampling on a variety of datasets. |
| Researcher Affiliation | Industry | Kareem Amin, Google Research, New York, NY, kamin@google.com; Giulia DeSalvo, Google Research, New York, NY, giuliad@google.com; Afshin Rostamizadeh, Google Research, New York, NY, rostami@google.com |
| Pseudocode | Yes | Algorithm 1 DPL-IWAL Algorithm |
| Open Source Code | No | The paper mentions using publicly available datasets and libraries like scikit-learn and LIBSVM tools, but does not provide a link or explicit statement about the public availability of the authors' own source code for DPL-IWAL or DPL-Simplified. |
| Open Datasets | Yes (see the dataset-loading sketch below the table) | We test six publicly available datasets [Chang and Lin] and for each, we use linear logistic regression models trained using the Python scikit-learn library. Chih-Chung Chang and Chih-Jen Lin. LIBSVM. https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/. Accessed: 2021-05-28. |
| Dataset Splits | No | The paper mentions using a 'training split' and 'test split' for experiments ('a different random train/test split for each trial'), but it does not explicitly specify a separate 'validation' split or its size/proportion. |
| Hardware Specification | No | The paper describes the software libraries and datasets used for experiments but does not provide any specific details about the hardware specifications (e.g., GPU/CPU models, memory) on which these experiments were run. |
| Software Dependencies | No | The paper mentions using the 'Python scikit-learn library', 'scikit-learn's KNeighborsClassifier', and 'scikit-learn's LogisticRegression implementation', but it does not specify version numbers for these software components. |
| Experiment Setup | Yes (see the batch-protocol sketch below the table) | We execute a batch variant of DPL-Simplified, where at each iteration we process a batch of 5,000 examples, querying 20% of the examples for their labels and making predictions for the rest. All methods are seeded with 500 randomly sampled initial examples and each experiment is run for 10 trials. |
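
As a concrete illustration of the dataset and model setup quoted in the Open Datasets row, the sketch below loads a LIBSVM-format dataset with scikit-learn's `load_svmlight_file` and fits a linear logistic regression model, redrawing a random train/test split as the paper describes. This is not the authors' code: the file path (`data/a9a.libsvm`) and the 80/20 split proportion are placeholder assumptions, since the paper does not state which file or split sizes were used.

```python
# Minimal sketch (not the authors' code): load a LIBSVM-format dataset and
# train a linear logistic regression model with scikit-learn.
from sklearn.datasets import load_svmlight_file
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Placeholder path: any dataset downloaded from the LIBSVM repository.
X, y = load_svmlight_file("data/a9a.libsvm")

# The paper draws a different random train/test split per trial;
# the 80/20 proportion here is an assumption.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```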
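The Experiment Setup row describes a batch protocol: seed with 500 randomly sampled labeled examples, then process batches of 5,000 examples, querying labels for 20% of each batch, over 10 trials. The sketch below mimics that bookkeeping using the margin-sampling baseline the paper compares against; it is not DPL-IWAL or DPL-Simplified, whose query rule is defined in the paper, and all function and variable names are illustrative.

```python
# Sketch of the batch experiment protocol from the setup row, using the
# margin-sampling baseline as the query rule (NOT DPL-IWAL / DPL-Simplified).
import numpy as np
from sklearn.linear_model import LogisticRegression

def run_margin_sampling_trial(X_pool, y_pool, seed_size=500,
                              batch_size=5000, query_frac=0.20, rng=None):
    if rng is None:
        rng = np.random.default_rng()
    order = rng.permutation(len(y_pool))

    # Seed the model with 500 randomly sampled labeled examples.
    labeled = list(order[:seed_size])
    clf = LogisticRegression(max_iter=1000)
    clf.fit(X_pool[labeled], y_pool[labeled])

    # Stream the remaining pool in batches of 5,000 examples.
    for start in range(seed_size, len(order), batch_size):
        batch = order[start:start + batch_size]
        proba = clf.predict_proba(X_pool[batch])

        # Margin sampling: query the 20% of the batch whose top-two class
        # probabilities are closest (smallest margin).
        sorted_p = np.sort(proba, axis=1)
        margin = sorted_p[:, -1] - sorted_p[:, -2]
        n_query = int(query_frac * len(batch))
        queried = batch[np.argsort(margin)[:n_query]]

        # Labels are revealed only for queried examples; the remaining
        # examples would be predicted by the current model (omitted here).
        labeled.extend(queried.tolist())
        clf.fit(X_pool[labeled], y_pool[labeled])

    return clf
```

The seeding, batching, and 20% query budget are the same regardless of which query-selection rule is plugged in; only the scoring step would change for the authors' method.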