Creating Training Sets via Weak Indirect Supervision
Authors: Jieyu Zhang, Bohan Wang, Xiangchen Song, Yujing Wang, Yaming Yang, Jing Bai, Alexander Ratner
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | On both image and text classification tasks as well as an industrial advertising application, we demonstrate the advantages of PLRM by outperforming baselines by a margin of 2%-9%. |
| Researcher Affiliation | Collaboration | Microsoft Research Asia; University of Washington; University of Science and Technology of China; Carnegie Mellon University; Snorkel AI, Inc. |
| Pseudocode | Yes | Algorithm 1 WIS |
| Open Source Code | No | Our code will be released upon acceptance. |
| Open Datasets | Yes | We demonstrate the applicability and performance of our method on image classification tasks derived from ILSVRC2012 (Russakovsky et al., 2015) and text classification tasks derived from LSHTC-3 (Partalas et al., 2015). |
| Dataset Splits | No | We sample data belonging to unseen classes for our experiments and split them into train and test sets. |
| Hardware Specification | Yes | All experiments ran on a machine with an Intel(R) Xeon(R) E5-2678 v3 CPU with 512GB of memory and a GeForce GTX 1080Ti 11GB GPU. |
| Software Dependencies | No | All the code was implemented in Python. We use the standard implementation of the logistic regression model from the Python scikit-learn library and the ResNet model from the torchvision library. Version numbers for these software components are not specified. |
| Experiment Setup | Yes | For the training of PGMs, we set the learning rate to 1/n, where n is the number of training examples. For training the logistic regression model, we use the default parameters of the scikit-learn library. For training the ResNet model, we set the batch size to 256 and use the Adam optimizer with a learning rate of 1e-3 and a weight decay of 5e-5. |
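The Experiment Setup row above pins down the reported end-model hyperparameters. Below is a minimal sketch of that configuration, assuming standard scikit-learn and PyTorch/torchvision APIs; the ResNet depth (`resnet18`), the class count, the training-set size `n`, and the `train_epoch` helper are illustrative assumptions not specified in the paper, while the batch size, optimizer, learning rates, and weight decay are taken directly from the row above.

```python
import torch
from torch import nn, optim
from torchvision.models import resnet18
from sklearn.linear_model import LogisticRegression

# Logistic regression end model: the paper reports using the
# scikit-learn defaults, so no hyperparameters are overridden.
lr_model = LogisticRegression()  # fit later with lr_model.fit(X_train, y_train)

# ResNet end model: Adam with lr=1e-3 and weight decay 5e-5, batch size 256
# (hyperparameters from the Experiment Setup row; the depth and number of
# classes are assumptions -- the paper only says "ResNet").
model = resnet18(num_classes=10)
optimizer = optim.Adam(model.parameters(), lr=1e-3, weight_decay=5e-5)
criterion = nn.CrossEntropyLoss()

def train_epoch(loader):
    """One training pass; `loader` is assumed built with batch_size=256."""
    model.train()
    for images, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()

# PGM training: the learning rate is set to 1/n, where n is the number
# of training examples (from the Experiment Setup row).
n = 50_000  # illustrative training-set size
pgm_learning_rate = 1.0 / n
```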