Active Learning with Partial Feedback
Authors: Peiyun Hu, Zachary C. Lipton, Anima Anandkumar, Deva Ramanan
ICLR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on Tiny ImageNet demonstrate that our most effective method improves 26% (relative) in top-1 classification accuracy compared to i.i.d. baselines and standard active learners given 30% of the annotation budget that would be required (naively) to annotate the dataset. |
| Researcher Affiliation | Collaboration | ¹Carnegie Mellon University, ²California Institute of Technology, ³Amazon AI |
| Pseudocode | Yes | Algorithm 1 Active Learning with Partial Feedback |
| Open Source Code | Yes | We implement all models in MXNet and have posted our code publicly. Our implementations of ALPF learners are available at: https://github.com/peiyunh/alpf |
| Open Datasets | Yes | We evaluate ALPF algorithms on the CIFAR10, CIFAR100, and Tiny ImageNet datasets |
| Dataset Splits | No | The paper mentions 'training sets' and reports 'test-set accuracy' but does not describe a distinct validation split (by percentage or count) or explain how a validation set was used, if one existed. |
| Hardware Specification | No | The paper does not describe the hardware used for the experiments, such as GPU or CPU models or cloud/cluster specifications. |
| Software Dependencies | No | The paper names MXNet as the implementation framework but does not give its version or list any other software dependencies with versions. |
| Experiment Setup | Yes | We initialize weights with the Xavier technique (Glorot and Bengio, 2010) and minimize our loss using the Adam (Kingma and Ba, 2014) optimizer, finding that it outperforms SGD significantly when learning from partial labels. We use the same learning rate of 0.001 for all experiments, first-order momentum decay (β1) of 0.9, and second-order momentum decay (β2) of 0.999. Finally, we train with mini-batches of 200 examples and perform standard data augmentation techniques including random cropping, resizing, and mirror-flipping. |
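
The pseudocode noted above (Algorithm 1, Active Learning with Partial Feedback) is only referenced by title in this report. The sketch below is a minimal NumPy rendering of how one round of that loop could look under my reading of the paper: score (example, composite-class) questions by the binary entropy of the model's predicted yes/no answer (an EIG-style criterion), ask the highest-scoring questions, and shrink each queried example's candidate label set according to the answer. The function names, the simulated oracle, and the toy dimensions are illustrative assumptions, not the authors' released code.

```python
import numpy as np

def binary_entropy(p):
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -(p * np.log(p) + (1 - p) * np.log(1 - p))

def select_questions(probs, candidate_sets, composite_classes, budget):
    """Score (example, composite-class) questions by the binary entropy of the
    model's predicted answer -- an EIG-style acquisition criterion."""
    scored = []
    for i, S in enumerate(candidate_sets):
        mass_S = probs[i, sorted(S)].sum()
        for c in composite_classes:
            inter = S & c
            if not inter or inter == S:          # answer is already determined
                continue
            p_yes = probs[i, sorted(inter)].sum() / max(mass_S, 1e-12)
            scored.append((binary_entropy(p_yes), i, c))
    scored.sort(key=lambda t: t[0], reverse=True)
    return scored[:budget]

def apply_answers(candidate_sets, questions, oracle):
    """Shrink each queried example's candidate label set using the yes/no answer."""
    for _, i, c in questions:
        S = candidate_sets[i]
        candidate_sets[i] = (S & c) if oracle(i, c) else (S - c)
    return candidate_sets

# Toy round: 4 examples, 4 classes, composite classes {0,1} and {2,3}.
rng = np.random.default_rng(0)
probs = rng.dirichlet(np.ones(4), size=4)            # stand-in softmax outputs
candidate_sets = [set(range(4)) for _ in range(4)]   # every label still possible
composites = [frozenset({0, 1}), frozenset({2, 3})]
true_labels = [0, 1, 2, 3]
oracle = lambda i, c: true_labels[i] in c            # simulated annotator
questions = select_questions(probs, candidate_sets, composites, budget=4)
apply_answers(candidate_sets, questions, oracle)
print(candidate_sets)
```

Between rounds, the learner is re-trained on the updated partial labels; as I read the paper, that training loss is the negative log of the total probability mass the model assigns to each example's remaining candidate set.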
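The experiment-setup row quotes Xavier initialization, Adam with a learning rate of 0.001 (β1 = 0.9, β2 = 0.999), mini-batches of 200, and random crop/resize/mirror-flip augmentation. Below is a minimal MXNet Gluon sketch of that optimization configuration; the ResNet-18 backbone, the CIFAR-10 loader, and the exact-label cross-entropy loss are stand-ins for illustration rather than the authors' verified pipeline.

```python
# Minimal MXNet/Gluon sketch of the quoted optimization setup; the ResNet-18
# architecture and CIFAR-10 loader are placeholders, not the authors' exact
# configuration.
import mxnet as mx
from mxnet import gluon, init
from mxnet.gluon.data.vision import transforms

ctx = mx.cpu()  # swap for mx.gpu(0) if available

# Xavier initialization, as stated in the paper.
net = gluon.model_zoo.vision.resnet18_v2(classes=10)
net.initialize(init.Xavier(), ctx=ctx)

# Adam with lr=0.001, beta1=0.9, beta2=0.999, as stated in the paper.
trainer = gluon.Trainer(net.collect_params(), 'adam',
                        {'learning_rate': 1e-3, 'beta1': 0.9, 'beta2': 0.999})

# Standard augmentation: random crop/resize and mirror flipping.
train_transform = transforms.Compose([
    transforms.RandomResizedCrop(32),
    transforms.RandomFlipLeftRight(),
    transforms.ToTensor(),
])
train_data = gluon.data.DataLoader(
    gluon.data.vision.CIFAR10(train=True).transform_first(train_transform),
    batch_size=200, shuffle=True)

loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()  # exact-label case; ALPF trains
                                                # on partial labels instead

for data, label in train_data:
    data, label = data.as_in_context(ctx), label.as_in_context(ctx)
    with mx.autograd.record():
        loss = loss_fn(net(data), label)
    loss.backward()
    trainer.step(data.shape[0])
    break  # single illustrative step
```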