Learning Multiple Tasks in Parallel with a Shared Annotator

Authors: Haim Cohen, Koby Crammer

NeurIPS 2014

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirical study with OCR data, vowel prediction (the VJ project), and document classification; the paper "shows that our algorithm outperforms other algorithms, one of which uses uniform allocation, and essentially achieves more (accuracy) for the same labour of the annotator."
Researcher Affiliation | Academia | Haim Cohen, Department of Electrical Engineering, The Technion, Israel Institute of Technology, Haifa 32000, Israel (hcohen@tx.technion.ac.il); Koby Crammer, Department of Electrical Engineering, The Technion, Israel Institute of Technology, Haifa 32000, Israel (koby@ee.technion.ac.il)
Pseudocode | Yes | Figure 2: SHAMPO: SHared Annotator for Multiple PrOblems.
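The Figure 2 pseudocode can be sketched roughly as follows. This is a paraphrase, not the paper's verbatim algorithm: the function name, the exact normalization of the query probabilities, and the tie-breaking on a zero margin are assumptions; the general shape (K perceptron-style learners, one shared annotator queried per round, query probability favouring low-margin tasks, plain vs. aggressive updates) follows the paper's description.

```python
import numpy as np

def shampo_round(W, X, y_oracle, b=1.0, lam=0.0, a=None, rng=None):
    """One round of a SHAMPO-style shared-annotator sketch (not the paper's exact code).

    W        : (K, d) weight vectors, one binary perceptron per task.
    X        : (K, d) one instance per task arriving this round.
    y_oracle : callable j -> label in {-1, +1} (the single shared annotator).
    b        : exploration/exploitation tradeoff parameter.
    lam      : aggressiveness threshold (0 = plain, b/2 = aggressive in the paper).
    a        : optional per-task prior weights (uniform prior a_i = 1 if None).
    """
    rng = rng or np.random.default_rng()
    K = W.shape[0]
    a = np.ones(K) if a is None else np.asarray(a, dtype=float)
    margins = np.abs(np.einsum("kd,kd->k", W, X))   # confidence per task
    m = margins.min()
    # Query probability: tasks with small margin (low confidence) are favoured.
    probs = a * b / (b + margins - m)
    probs /= probs.sum()
    j = rng.choice(K, p=probs)                       # task sent to the annotator
    y = y_oracle(j)
    score = W[j] @ X[j]
    # Update on a mistake (or zero-margin tie), and, in the aggressive
    # version, also on a correct prediction with margin at most lam.
    if y * score <= lam:
        W[j] += y * X[j]
    return j, y
```

Only the selected task `j` consumes an annotator query each round, which is the point of the shared-annotator setting: label effort is allocated across tasks instead of uniformly.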
Open Source Code | No | The paper does not provide any explicit statements or links to open-source code for the described methodology.
Open Datasets | Yes | We evaluated the SHAMPO algorithm using four datasets: USPS, MNIST (both OCR), Vocal Joystick (VJ, vowel recognition), and document classification. The USPS dataset contains 7,291 training examples and 2,007 test examples... The MNIST dataset, with 28×28 gray-scale images, contains 60,000 (10,000) training (test) examples.
Dataset Splits | No | The paper specifies training and test set sizes but does not explicitly mention a separate validation set split or its size.
Hardware Specification | No | The paper does not provide any specific hardware details (e.g., CPU, GPU models, memory) used for running the experiments.
Software Dependencies | No | The paper does not specify any software dependencies with version numbers.
Experiment Setup | Yes | We tried for SHAMPO 13 values for b, equally spaced on a logarithmic scale. All algorithms made a single pass over the training data. We ran two versions of the algorithm: a plain version, without aggressiveness (updates on mistakes only, λ = 0), and an aggressive version with λ = b/2 (we tried lower values of λ as in the bound, but we found that λ = b/2 gives best results), both with uniform prior (a_i = 1).
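The quoted setup (13 values of b equally spaced on a log scale, a plain λ = 0 run and an aggressive λ = b/2 run for each) can be sketched as a small sweep. The endpoints of the log scale below are illustrative placeholders; the quoted text does not state the actual range the authors used.

```python
import numpy as np

# 13 values of b, equally spaced on a logarithmic scale.
# The range 1e-3..1e3 is an assumed placeholder, not the paper's range.
b_values = np.logspace(-3, 3, num=13)

configs = []
for b in b_values:
    # Plain: update on mistakes only (lambda = 0).
    # Aggressive: lambda = b/2, reported as best in the paper.
    for name, lam in [("plain", 0.0), ("aggressive", b / 2)]:
        configs.append({"b": b, "lambda": lam, "variant": name, "prior": 1.0})
        # Each config would make a single pass over the training data
        # with uniform prior a_i = 1.
```

This yields 26 runs per dataset (13 values of b times two variants), each a single pass over the training stream.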