Textual Membership Queries

Authors: Jonathan Zarecki, Shaul Markovitch

IJCAI 2020

Reproducibility

Variable | Result | LLM Response
Research Type | Experimental | "We implement this framework in the textual domain and test it on several text classification tasks and show improved classifier performance as more MQs are labeled and incorporated into the training set." "We analyzed the performance of our framework on 5 publicly available sentence classification datasets." (Section 4, Empirical Evaluation)
Researcher Affiliation | Academia | Jonathan Zarecki and Shaul Markovitch, Department of Computer Science, Technion - Israel Institute of Technology, szarecki@cs.technion.ac.il, shaulm@cs.technion.ac.il
Pseudocode | Yes | Algorithm 1: Stochastic query synthesis; Algorithm 2: Search-based query synthesis
Open Source Code | Yes | "The code for all experiments is available here" (footnote 2: www.github.com/jonzarecki/textual-mqs)
Open Datasets | Yes | "We report results on 5 binary sentence classification datasets: three sentiment analysis datasets, one sentence subjectivity dataset, and one hate-speech detection dataset. CMR: Cornell sentiment polarity dataset [Pang and Lee, 2005]. SST: Stanford sentiment treebank, a sentence sentiment analysis dataset [Socher et al., 2013]. KS: A Kaggle short-sentence sentiment analysis dataset. HS: Hate speech and offensive language classification dataset [Davidson et al., 2017]. SUBJ: Cornell sentence subjective/objective dataset [Pang and Lee, 2004]."
Dataset Splits | No | The paper mentions evaluating against a test set and states cross-validation accuracy for an artificial expert, but does not specify the train/validation/test splits (e.g., percentages or counts) for the datasets used to train their own models.
Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU or CPU models, or memory specifications.
Software Dependencies | No | The paper mentions software tools like Dependency Word2vec and Spacy's "latest" part-of-speech parser, but it does not provide specific version numbers for any software dependencies or libraries used in the experiments.
Experiment Setup | Yes | "We used a core set of 10 sentences, a pool size of 20, an AL batch size of 5, and the uncertainty-sampling-based [Lewis and Gale, 1994] heuristic function as U for all experiments. All methods used an environment size of 10, and a linear classifier with averaged 300-dim GloVe [Pennington et al., 2014] word vectors as the learner."
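The setup above can be sketched as a single uncertainty-sampling step: train a linear classifier on the labeled core set, score the unlabeled pool by closeness to the decision boundary, and select a batch of the least-confident pool items. This is a minimal illustration, not the authors' code; it assumes scikit-learn's `LogisticRegression` as the linear learner, and random vectors stand in for the averaged 300-dim GloVe features. The sizes (core set 10, pool 20, batch 5) follow the paper.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
DIM, CORE_SIZE, POOL_SIZE, BATCH_SIZE = 300, 10, 20, 5

# Stand-ins for averaged 300-dim word vectors (GloVe in the paper).
X_core = rng.normal(size=(CORE_SIZE, DIM))
y_core = np.array([0, 1] * (CORE_SIZE // 2))  # labeled core set
X_pool = rng.normal(size=(POOL_SIZE, DIM))    # unlabeled pool

# Linear classifier trained on the labeled core set.
clf = LogisticRegression(max_iter=1000).fit(X_core, y_core)

# Uncertainty heuristic U: prefer pool items whose predicted
# positive-class probability is closest to 0.5.
probs = clf.predict_proba(X_pool)[:, 1]
uncertainty = -np.abs(probs - 0.5)

# Query the BATCH_SIZE most uncertain items for labeling.
query_idx = np.argsort(uncertainty)[-BATCH_SIZE:]
```

In a full AL loop, the queried items would be labeled by the expert, appended to the core set, and the classifier retrained before the next batch is selected.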