Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Sample Efficient Learning of Predictors that Complement Humans
Authors: Mohammad-Amin Charusaie, Hussein Mozannar, David Sontag, Samira Samadi
ICML 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Code for our experiments is found in https://github.com/clinicalml/active_learn_to_defer. Dataset. We use the CIFAR-10 image classification dataset...In Figure 2, we plot the difference of accuracy between joint learning and staged learning...In Figure 3, we plot the difference of accuracy between joint learning and staged learning...In Figure 4, we plot corresponding errors of the DoD algorithm... |
| Researcher Affiliation | Academia | 1Max Planck Institute for Intelligent Systems, Tübingen, Germany. 2CSAIL, Massachusetts Institute of Technology, Cambridge, MA, USA. |
| Pseudocode | Yes | Algorithm 1: Active Learning algorithm DoD (Disagreement on Disagreements) |
| Open Source Code | Yes | Code for our experiments is found in https://github.com/clinicalml/active_learn_to_defer. |
| Open Datasets | Yes | We use the CIFAR-10 image classification dataset (Krizhevsky et al., 2009) consisting of 32 × 32 color images drawn from 10 classes. |
| Dataset Splits | Yes | We use the CIFAR validation set of 10k images as the test set and split the CIFAR training set 90/10 for training and validation. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments. It only describes the general model architecture used. |
| Software Dependencies | No | The paper mentions using "PyTorch" but does not specify a version number for this or any other software dependency. |
| Experiment Setup | Yes | We use the AdamW optimizer...with learning rate 0.001 and default parameters on PyTorch. We also use a cosine annealing learning rate scheduler and train for 100 epochs...For the surrogate L_CE^α..., we perform a search for α over a grid [0, 0.1, 0.5, 1]. For the rejector model...we use the parameters [100, 100, 1000, 500]. |
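The split and search settings quoted above can be expressed as a minimal, self-contained sketch. The dataset sizes, the 90/10 split, and the α grid come directly from the quoted excerpts; the helper name `train_val_split` is a hypothetical illustration, not from the authors' code.

```python
# Standard CIFAR-10 sizes, as used in the quoted setup:
CIFAR10_TRAIN = 50_000  # CIFAR training set, split 90/10 for train/validation
CIFAR10_TEST = 10_000   # CIFAR validation set of 10k images, used as the test set

def train_val_split(n_total: int, val_frac: float = 0.10) -> tuple[int, int]:
    """Split a dataset into train/validation counts (90/10 by default)."""
    n_val = int(n_total * val_frac)
    return n_total - n_val, n_val

# Grid searched for the alpha parameter of the surrogate loss L_CE^alpha:
ALPHA_GRID = [0, 0.1, 0.5, 1]

n_train, n_val = train_val_split(CIFAR10_TRAIN)
print(n_train, n_val, CIFAR10_TEST)  # 45000 5000 10000
```

With these counts, the reported setup trains for 100 epochs on the 45k-image training split, selects α on the 5k-image validation split, and evaluates on the 10k-image test set.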