Iterative Teaching by Label Synthesis
Authors: Weiyang Liu, Zhen Liu, Hanchen Wang, Liam Paull, Bernhard Schölkopf, Adrian Weller
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Finally, we empirically demonstrate the value of our framework." and "This section comprehensively evaluates LAST in the omniscient teaching scenario. Experimental details and more results (including BLAST) are given in Appendix G and Appendix F, respectively." |
| Researcher Affiliation | Academia | 1 University of Cambridge; 2 MPI for Intelligent Systems, Tübingen; 3 Mila, Université de Montréal; 4 CIFAR AI Chair; 5 The Alan Turing Institute |
| Pseudocode | Yes | Algorithm 1 Omniscient Greedy LAST: Initialize t = 1, w0, ϵ and T; while ‖wt − w*‖2 ≥ ϵ or t ≤ T do: randomly select a sample xt from the pool; solve Eq. (1) to synthesize the label yt; use the synthesized label yt for the update wt = wt−1 − ηt∇ℓ(xt, yt \| wt−1); set t ← t + 1; end. (A runnable sketch follows the table.) |
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code or a link to a code repository. |
| Open Datasets | Yes | For real image data, we use 3/5 digits in MNIST. |
| Dataset Splits | Yes | "For synthetic regression data, we generate 1000 data points (500 for training and 500 for testing)... For linear logistic regression, we generate 500 two-dimensional data points (300 for training and 200 for testing)." and "For the binary classification on MNIST (digit 3 and 5), we randomly select 500 samples (250 for digit 3 and 250 for digit 5) from the original MNIST training set as our training data, and 200 samples (100 for digit 3 and 100 for digit 5) from the original MNIST test set as our test data." (A data-split sketch follows the table.) |
| Hardware Specification | No | The paper does not specify any particular hardware (e.g., GPU/CPU models, memory, cloud instances) used for running the experiments. |
| Software Dependencies | No | The paper mentions 'All algorithms are implemented with Python and PyTorch' but does not specify version numbers for these software components or any other libraries. |
| Experiment Setup | Yes | The learning rate is selected from {0.001, 0.005, 0.01, 0.05, 0.1, 0.5} via grid search for all algorithms and the one that gives the best performance in terms of convergence is selected. The batch size is 1 for all algorithms. For MLP learners, we use a 2-layer MLP with 100 hidden neurons. The weights are initialized by Kaiming Initialization. (A learner-setup sketch follows the table.) |
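
To make the Algorithm 1 pseudocode quoted in the Pseudocode row concrete, here is a minimal sketch of an omniscient greedy label-synthesis teacher, assuming a linear learner with squared loss and a grid search over candidate labels as a stand-in for solving Eq. (1); the paper's actual objective and label space may differ.

```python
# Minimal sketch of omniscient greedy LAST for a linear learner with squared
# loss. The grid search over candidate labels below is an assumption standing
# in for solving Eq. (1); it is not the paper's exact formulation.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic pool and target model w* (the omniscient teacher knows w_star).
d, n_pool = 10, 500
X = rng.normal(size=(n_pool, d))
w_star = rng.normal(size=d)

w = np.zeros(d)                                   # learner's initial weights w0
eta, eps, T = 0.05, 1e-3, 5000
candidate_labels = np.linspace(-5.0, 5.0, 201)    # assumed label search space

t = 1
while np.linalg.norm(w - w_star) >= eps and t <= T:
    x = X[rng.integers(n_pool)]                   # randomly select a sample xt
    # For each candidate label y, simulate one SGD step on 0.5 * (w^T x - y)^2
    # and keep the y whose update lands closest to w* (the greedy criterion).
    best_y, best_dist = None, np.inf
    for y in candidate_labels:
        w_next = w - eta * (w @ x - y) * x
        dist = np.linalg.norm(w_next - w_star)
        if dist < best_dist:
            best_y, best_dist = y, dist
    # Use the synthesized label yt for the actual update.
    w = w - eta * (w @ x - best_y) * x
    t += 1

print(f"stopped at t={t}, ||w - w*|| = {np.linalg.norm(w - w_star):.4f}")
```

The sketch stops once either the distance threshold ϵ or the iteration budget T is reached, which is the natural reading of the stopping condition in the pseudocode.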
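
The MNIST 3-vs-5 split described in the Dataset Splits row (250 + 250 training images, 100 + 100 test images) can be rebuilt along the following lines; the sampling procedure, seed, and preprocessing are assumptions, since the paper does not specify them.

```python
# Hypothetical reconstruction of the MNIST 3-vs-5 subsets; exact sampling and
# preprocessing are not stated in the paper.
import torch
from torchvision import datasets, transforms

def subset_digits(train, per_digit, seed=0):
    ds = datasets.MNIST("data", train=train, download=True,
                        transform=transforms.ToTensor())
    g = torch.Generator().manual_seed(seed)
    idx = []
    for digit in (3, 5):
        pool = (ds.targets == digit).nonzero(as_tuple=True)[0]
        # Randomly pick per_digit examples of this digit.
        idx.append(pool[torch.randperm(len(pool), generator=g)[:per_digit]])
    return torch.utils.data.Subset(ds, torch.cat(idx).tolist())

train_set = subset_digits(train=True, per_digit=250)    # 500 training samples
test_set = subset_digits(train=False, per_digit=100)    # 200 test samples
```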
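
The Experiment Setup row translates to roughly the following PyTorch learner configuration; the input and output dimensions and the Kaiming variant (normal vs. uniform) are assumptions not stated in the excerpt.

```python
# Sketch of the MLP learner setup: 2-layer MLP with 100 hidden neurons,
# Kaiming initialization, SGD with batch size 1, learning rate chosen by grid
# search over the reported candidates. Dimensions assume flattened MNIST 3/5.
import torch
import torch.nn as nn

class MLPLearner(nn.Module):
    def __init__(self, in_dim=784, hidden=100, out_dim=2):
        super().__init__()
        self.fc1 = nn.Linear(in_dim, hidden)
        self.fc2 = nn.Linear(hidden, out_dim)
        # Kaiming initialization, as stated in the setup (variant assumed).
        nn.init.kaiming_normal_(self.fc1.weight)
        nn.init.kaiming_normal_(self.fc2.weight)

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))

lr_grid = [0.001, 0.005, 0.01, 0.05, 0.1, 0.5]   # reported grid-search range
model = MLPLearner()
optimizer = torch.optim.SGD(model.parameters(), lr=lr_grid[2])  # batch size 1
```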