Sample Efficient Learning of Predictors that Complement Humans
Authors: Mohammad-Amin Charusaie, Hussein Mozannar, David Sontag, Samira Samadi
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Code for our experiments is found in https://github.com/clinicalml/active_learn_to_defer. Dataset. We use the CIFAR-10 image classification dataset...In Figure 2, we plot the difference of accuracy between joint learning and staged learning...In Figure 3, we plot the difference of accuracy between joint learning and staged learning...In Figure 4, we plot corresponding errors of the DoD algorithm... |
| Researcher Affiliation | Academia | 1Max Planck Institute for Intelligent Systems, Tübingen, Germany. 2CSAIL, Massachusetts Institute of Technology, Cambridge, MA, USA. |
| Pseudocode | Yes | Algorithm 1: Active Learning algorithm DoD (Disagreement on disagreements) |
| Open Source Code | Yes | Code for our experiments is found in https://github.com/clinicalml/active_learn_to_defer. |
| Open Datasets | Yes | We use the CIFAR-10 image classification dataset (Krizhevsky et al., 2009) consisting of 32 × 32 color images drawn from 10 classes. |
| Dataset Splits | Yes | We use the CIFAR validation set of 10k images as the test set and split the CIFAR training set 90/10 for training and validation. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments. It only describes the general model architecture used. |
| Software Dependencies | No | The paper mentions using "PyTorch" but does not specify a version number for this or any other software dependency. |
| Experiment Setup | Yes | We use the AdamW optimizer...with learning rate 0.001 and default parameters on PyTorch. We also use a cosine annealing learning rate scheduler and train for 100 epochs...For the surrogate L^α_CE..., we perform a search for α over a grid [0, 0.1, 0.5, 1]. For the rejector model...we use the parameters [100, 100, 1000, 500]. |
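The quoted "Dataset Splits" row describes a 90/10 split of the CIFAR-10 training set into training and validation sets. A minimal, dependency-free sketch of such a split is shown below; the helper name `train_val_split` and the fixed seed are illustrative assumptions, not from the paper.

```python
import random

def train_val_split(n, val_frac=0.1, seed=0):
    """Shuffle indices 0..n-1 and split them into train/validation index lists.

    Hypothetical helper illustrating a 90/10 split; the paper does not
    specify how its split was implemented.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_val = int(n * val_frac)
    return idx[n_val:], idx[:n_val]

# CIFAR-10 has 50,000 training images; a 90/10 split yields 45,000 / 5,000.
train_idx, val_idx = train_val_split(50000)
```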
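The "Experiment Setup" row cites AdamW with learning rate 0.001, a cosine annealing scheduler, and 100 training epochs. As a hedged illustration of what that schedule does, the following plain-Python function computes the closed-form cosine-annealed learning rate (the formula used by PyTorch's `CosineAnnealingLR`), assuming `T_max = 100` and a minimum rate of 0; the function name and these defaults are assumptions for this sketch.

```python
import math

def cosine_annealing_lr(base_lr, epoch, t_max, eta_min=0.0):
    """Cosine-annealed learning rate at a given epoch.

    Closed form: eta_min + (base_lr - eta_min) * (1 + cos(pi * epoch / t_max)) / 2.
    The rate starts at base_lr and decays smoothly to eta_min at epoch t_max.
    """
    return eta_min + (base_lr - eta_min) * (1 + math.cos(math.pi * epoch / t_max)) / 2

# Schedule matching the quoted setup: base learning rate 0.001 over 100 epochs.
schedule = [cosine_annealing_lr(0.001, t, 100) for t in range(101)]
```

At epoch 0 the rate equals the base rate 0.001, at the midpoint (epoch 50) it is half, and at epoch 100 it reaches 0.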