Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

The Label Complexity of Mixed-Initiative Classifier Training

Authors: Jina Suh, Xiaojin Zhu, Saleema Amershi

ICML 2016 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct Mechanical Turk human experiments on two stylistic classifier training tasks to illustrate our approach.
Researcher Affiliation | Collaboration | Microsoft Research, Redmond, WA, USA; University of Wisconsin-Madison, Madison, WI, USA
Pseudocode | Yes | Algorithm 1: The Mixed-Initiative Mechanism
Open Source Code | No | The paper does not provide any concrete access information (e.g., a link or explicit statement of code release) for its source code.
Open Datasets | No | The paper describes tasks (1D Threshold Task, 1D Interval Task) and states that human experiments were conducted using Amazon Mechanical Turk, but it does not provide access information (link, citation with authors/year, or mention of a standard benchmark) for the dataset collected or used.
Dataset Splits | No | The paper describes a human experiment setup including participant filtering, but it does not specify training, validation, or test dataset splits in the context of machine learning model training.
Hardware Specification | No | The paper does not provide any specific details about the hardware used to conduct the experiments.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers.
Experiment Setup | Yes | We designed a 2x3, between-subjects study where each experiment compared the three training paradigms (computer-initiated, human-initiated, mixed-initiative) and teacher education (no education, education by analogues). ... For the 1D threshold task, TD = 2 with the optimal teaching set {(x1 = 19000, y1 = 1), (x2 = 19001, y2 = 1)}, while active learning using binary search would require 14 queries. For the 1D interval task, TD = 4 with the optimal teaching set {(x1 = 1259, y1 = 1), (x2 = 1260, y2 = 1), (x3 = 1360, y3 = 1), (x4 = 1361, y4 = 1)}, while active learning requires 26 queries.
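The gap reported in the Experiment Setup row (a teaching set of 2 examples versus 14 binary-search queries for the threshold task) can be illustrated with a short sketch. This is not the paper's code: the domain size of 2**14 points and the threshold position are assumptions chosen so that binary search needs exactly 14 label queries, since the paper does not state the exact feature range.

```python
# Sketch contrasting optimal teaching with active learning (binary search)
# for a 1D threshold classifier. Assumption: an integer domain of 2**14
# points with the decision threshold at its midpoint; the paper's actual
# feature range is not specified.

def binary_search_queries(domain_size):
    """Count the label queries binary search needs to localize a
    threshold between two adjacent points of an integer domain."""
    queries = 0
    lo, hi = 0, domain_size - 1
    while hi - lo > 1:
        queries += 1
        mid = (lo + hi) // 2
        # Simulated labeling oracle: points at or above the midpoint
        # of the domain are labeled 1, the rest 0 (assumed threshold).
        if mid < domain_size // 2:
            lo = mid   # mid labeled 0: threshold is to the right
        else:
            hi = mid   # mid labeled 1: threshold is to the left
    return queries

# Optimal teaching set from the paper: two adjacent points bracketing
# the threshold suffice (teaching dimension TD = 2).
teaching_set = [(19000, 1), (19001, 1)]
print(len(teaching_set))                 # 2 labeled examples
print(binary_search_queries(2**14))      # 14 queries
```

The same logic explains the interval task: locating each of the interval's two boundaries takes its own binary search, so the query count roughly doubles while the optimal teaching set only grows to four points.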