Training Complex Models with Multi-Task Weak Supervision
Authors: Alexander Ratner, Braden Hancock, Jared Dunnmon, Frederic Sala, Shreyash Pandey, Christopher Ré
AAAI 2019, pp. 4763-4771
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | On three fine-grained classification problems, we show that our approach leads to average gains of 20.2 points in accuracy over a traditional supervised approach, 6.8 points over a majority vote baseline, and 4.1 points over a previously proposed weak supervision method that models tasks separately. |
| Researcher Affiliation | Academia | Department of Computer Science Stanford University {ajratner, bradenjh, jdunnmon, fredsala, shreyash, chrismre}@stanford.edu |
| Pseudocode | Yes | Algorithm 1 Source Accuracy Estimation for Multi-Task Weak Supervision |
| Open Source Code | Yes | To further validate this, we have released an open-source implementation of our framework: github.com/HazyResearch/metal |
| Open Datasets | Yes | Named Entity Recognition (NER): ...over the OntoNotes dataset (Weischedel et al. 2011)... Relation Extraction (RE): ...in the TACRED dataset (Zhang et al. 2017b)... Medical Document Classification (Doc): ...from the OpenI dataset (National Institutes of Health 2017). |
| Dataset Splits | Yes | Each dataset consists of a large (3k-63k) amount of unlabeled training data and a small (200-350) amount of labeled data which we refer to as the development set, which we use for (a) a traditional supervision baseline, and (b) for hyperparameter tuning of the end model (see Appendix). |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU models, CPU types) used for running the experiments. |
| Software Dependencies | No | The paper mentions 'PyTorch' as a library for implementation, but does not provide specific version numbers for it or any other software dependencies. |
| Experiment Setup | No | The paper states 'Hyperparameters were selected with an initial search for each application (see Appendix), then fixed,' deferring specific details to the Appendix which is not included in the provided text. |
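The results row above cites gains over a majority vote baseline, which aggregates the outputs of multiple weak supervision sources by taking the most common non-abstaining vote per example. As a rough illustration only (not the paper's implementation; the `ABSTAIN` convention and function name are assumptions), such a baseline can be sketched as:

```python
from collections import Counter

ABSTAIN = -1  # hypothetical convention: a source may abstain on an example

def majority_vote(label_matrix):
    """Predict each example's label by majority vote over weak source outputs,
    ignoring abstentions. label_matrix: one list of source votes per example."""
    preds = []
    for votes in label_matrix:
        counts = Counter(v for v in votes if v != ABSTAIN)
        # Fall back to ABSTAIN when every source abstained on this example
        preds.append(counts.most_common(1)[0][0] if counts else ABSTAIN)
    return preds

# Three examples, three weak sources (hypothetical votes)
L = [[1, 1, 0], [0, -1, 0], [-1, -1, -1]]
print(majority_vote(L))  # [1, 0, -1]
```

Unlike this unweighted vote, the paper's approach (Algorithm 1) estimates per-source accuracies and reweights sources accordingly, which is where its reported 6.8-point average gain over majority vote comes from.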