Promises and Pitfalls of Threshold-based Auto-labeling
Authors: Harit Vishwakarma, Heguang Lin, Frederic Sala, Ramya Korlakai Vinayak
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate our theoretical guarantees with extensive experiments on synthetic and real datasets. |
| Researcher Affiliation | Academia | Harit Vishwakarma hvishwakarma@cs.wisc.edu University of Wisconsin-Madison Heguang Lin hglin@seas.upenn.edu University of Pennsylvania Frederic Sala fredsala@cs.wisc.edu University of Wisconsin-Madison Ramya Korlakai Vinayak ramya@ece.wisc.edu University of Wisconsin-Madison |
| Pseudocode | Yes | Algorithm 1 Threshold-based Auto-Labeling (TBAL) Input: Unlabeled pool Xpool, auto-labeling error threshold ϵa, seed data size ns, batch size for active query nb, labeled validation data pool Dval. Output: Dout = {(xi, yi) : xi ∈ Xpool} (a minimal sketch of this loop appears below the table) |
| Open Source Code | Yes | https://github.com/harit7/TBAL |
| Open Datasets | Yes | We use the following synthetic and real datasets. [...] a) Unit-Ball [...] b) Tiny-ImageNet [...] c) IMDB Reviews [...] d) CIFAR-10 [...] MNIST [15] is a standard image dataset of hand-written digits. |
| Dataset Splits | Yes | We split the data into two sufficiently large pools. One is used as Xpool on which auto-labeling algorithms are run and the other is used as Xval from which the algorithms subsample validation data. [...] Unit-Ball [...] 16K are in Xpool and 4K are in Xval. [...] IMDB Reviews [...] standard train set of size 25K and split it into Xpool and Xval of sizes 20K and 5K respectively. [...] CIFAR-10 [...] standard training set into Xpool of size 40K and the validation pool of size 10K. (a toy split example appears below the table) |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processors, or memory used for running the experiments. |
| Software Dependencies | No | The paper mentions software like 'sklearn' for SVMs and 'PyTorch' in tutorials, but it does not specify exact version numbers for these or other software dependencies. |
| Experiment Setup | Yes | To train a multi-layer perceptron (MLP) on the pre-computed embeddings of IMDB and Tiny-ImageNet we use SGD with learning rates of 0.05 and 0.1 respectively, and a batch size of 64. To train the medium CNN we use SGD with a learning rate of 10⁻², batch size 256, and momentum of 0.9. More details on model training are in the Appendix. (an optimizer-settings sketch appears below the table) |
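
For readers who want a concrete picture of the procedure summarized in the Pseudocode row, the following is a minimal Python sketch of a TBAL-style loop. It is not the authors' implementation (see https://github.com/harit7/TBAL for that); the logistic-regression classifier, the `pick_threshold` helper, and the use of `y_pool` as a stand-in for the human oracle are all illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def pick_threshold(val_conf, val_correct, eps_a):
    """Smallest confidence threshold whose estimated validation error is <= eps_a."""
    for t in np.unique(val_conf):
        covered = val_conf >= t
        if covered.sum() == 0:
            continue
        if 1.0 - val_correct[covered].mean() <= eps_a:
            return t
    return np.inf  # no threshold is safe -> auto-label nothing this round

def tbal_sketch(X_pool, y_pool, X_val, y_val, eps_a=0.05,
                n_seed=100, n_batch=100, n_rounds=10, seed=0):
    """Illustrative TBAL loop; y_pool stands in for the human labeling oracle."""
    rng = np.random.default_rng(seed)
    queried = set(rng.choice(len(X_pool), size=n_seed, replace=False).tolist())
    auto = {}  # pool index -> auto-assigned label

    for _ in range(n_rounds):
        # 1) Train on all points labeled by the (simulated) oracle so far.
        idx = sorted(queried)
        clf = LogisticRegression(max_iter=1000).fit(X_pool[idx], y_pool[idx])

        # 2) Calibrate a confidence threshold on the held-out validation pool.
        val_conf = clf.predict_proba(X_val).max(axis=1)
        val_correct = clf.predict(X_val) == y_val
        t = pick_threshold(val_conf, val_correct, eps_a)

        # 3) Auto-label remaining pool points whose confidence clears the threshold.
        remaining = [i for i in range(len(X_pool)) if i not in queried and i not in auto]
        if not remaining:
            break
        conf = clf.predict_proba(X_pool[remaining]).max(axis=1)
        pred = clf.predict(X_pool[remaining])
        for i, c, p in zip(remaining, conf, pred):
            if c >= t:
                auto[i] = p

        # 4) Actively query a fresh batch from the still-unlabeled points.
        left = [i for i in remaining if i not in auto]
        if not left:
            break
        queried.update(rng.choice(left, size=min(n_batch, len(left)),
                                  replace=False).tolist())
    return auto
```

On a synthetic Unit-Ball-style setup, one would generate `X_pool`, `y_pool`, `X_val`, `y_val` and call `tbal_sketch(X_pool, y_pool, X_val, y_val, eps_a=0.05)` to obtain a dictionary of auto-assigned labels.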
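The Dataset Splits row describes carving each training set into an auto-labeling pool Xpool and a validation pool Xval. A toy version of that split, using random placeholder data in place of the real CIFAR-10 features, might look like:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder features standing in for the 50K CIFAR-10 training examples;
# the real experiments use the actual datasets listed in the table above.
X_full = np.random.randn(50_000, 512)
y_full = np.random.randint(0, 10, size=50_000)

# 40K auto-labeling pool (Xpool) and 10K validation pool (Xval),
# mirroring the CIFAR-10 split described in the paper.
X_pool, X_val, y_pool, y_val = train_test_split(
    X_full, y_full, test_size=10_000, random_state=0)
```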
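Likewise, the hyperparameters quoted in the Experiment Setup row translate into PyTorch optimizer settings along these lines; the network architectures below are placeholders, since the paper's exact MLP and "medium CNN" definitions live in its appendix and repository.

```python
import torch.nn as nn
from torch.optim import SGD

# MLP on pre-computed embeddings (IMDB shown; Tiny-ImageNet uses lr=0.1).
# The 768-dim input and hidden width are assumptions, not the paper's exact architecture.
mlp = nn.Sequential(nn.Linear(768, 256), nn.ReLU(), nn.Linear(256, 2))
mlp_opt = SGD(mlp.parameters(), lr=0.05)  # batch size 64

# Stand-in "medium CNN" for CIFAR-10-sized inputs (3x32x32).
cnn = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(), nn.Linear(32 * 16 * 16, 10),
)
cnn_opt = SGD(cnn.parameters(), lr=1e-2, momentum=0.9)  # batch size 256
```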