Active Learning from Peers

Authors: Keerthiram Murugesan, Jaime Carbonell

NeurIPS 2017

Reproducibility Variable | Result | LLM Response

Research Type | Experimental | "Experiments over three multitask learning benchmark datasets show clearly superior performance over baselines such as assuming task independence, learning only from the oracle and not learning from peer tasks."

Researcher Affiliation | Academia | "Keerthiram Murugesan, Jaime Carbonell, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, {kmuruges,jgc}@cs.cmu.edu"

Pseudocode | Yes | "Algorithm 1: Active Learning from Peers"

Open Source Code | No | The paper does not provide an explicit statement or link for open-source code related to the described methodology.

Open Datasets | Yes | "Landmine Detection consists of 19 tasks collected from different landmine fields" (http://www.ee.duke.edu/~lcarin/LandmineData.zip). "Spam Detection: We use the dataset obtained from ECML PAKDD 2006 Discovery challenge for the spam detection task" (http://ecmlpkdd2006.org/challenge.html). "Sentiment Analysis: We evaluated our algorithm on product reviews from Amazon on a dataset containing reviews from 24 domains" (http://www.cs.jhu.edu/~mdredze/datasets/sentiment).

Dataset Splits | Yes | "Unless otherwise specified, all model parameters are chosen via 5-fold cross validation."

Hardware Specification | No | The paper mentions "CPU time" but does not specify any particular hardware, such as the CPU or GPU models used for the experiments.

Software Dependencies | No | The paper does not specify any software dependencies with version numbers.

Experiment Setup | Yes | "We set the value of b1 = 1 for all the experiments and the value of b2 is tuned from 20 different values. Unless otherwise specified, all model parameters are chosen via 5-fold cross validation. In order to efficiently evaluate the proposed methods, we restrict the total number of label requests issued to the oracle during training, that is we give all the methods the same query budget: (10%, 20%, 30%) of the total number of examples T on sentiment dataset."
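The b1 and b2 parameters above control the paper's two-stage selective-sampling rule (Algorithm 1): an uncertain learner first asks its peer tasks for a label and only falls back to the oracle, against the query budget, when the peers are also unsure. The following is a minimal illustrative sketch of that decision rule, not the authors' implementation; the linear per-task models `W`, the peer-relationship weights `tau`, the `oracle` callable, and all variable names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def query_decision(x, task, W, tau, oracle, b1=1.0, b2=10.0):
    """Hedged sketch of a peer-then-oracle selective-sampling step.

    x      : feature vector of the incoming example
    task   : index of the task that received the example
    W      : (K, d) array, one linear model per task (illustrative)
    tau    : (K, K) peer-relationship weights (illustrative)
    oracle : callable returning the true label (counts against the budget)

    Returns (label_used, queried_oracle).
    """
    own_score = W[task] @ x
    p = abs(own_score)                          # learner's own confidence margin
    if rng.random() >= b1 / (b1 + p):           # confident enough: keep own prediction
        return np.sign(own_score), False
    # Uncertain: ask the peer tasks via a relationship-weighted vote.
    peer_score = sum(tau[task, k] * (W[k] @ x)
                     for k in range(W.shape[0]) if k != task)
    if rng.random() >= b2 / (b2 + abs(peer_score)):
        return np.sign(peer_score), False       # accept the peers' label for free
    return oracle(x), True                      # peers unsure too: pay for the oracle
```

The Bernoulli sampling (probability b1/(b1+p), then b2/(b2+|peer score|)) makes oracle queries rarer as either the learner or its peers become more confident, which is how the fixed query budget in the experiment setup is respected.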