Learning Hyper Label Model for Programmatic Weak Supervision
Authors: Renzhi Wu, Shen-En Chen, Jieyu Zhang, Xu Chu
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | On 14 real-world datasets, our hyper label model outperforms the best existing methods in both accuracy (by 1.4 points on average) and efficiency (by six times on average). |
| Researcher Affiliation | Academia | Renzhi Wu1, Shen-En Chen1, Jieyu Zhang2, Xu Chu1 1Georgia Tech 2University of Washington |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. Figure 1 is a diagram illustrating the network architecture. |
| Open Source Code | Yes | Our code is available at https://github.com/wurenzhi/hyper_label_model |
| Open Datasets | Yes | Datasets. We use all 14 classification datasets in a recent weak supervision benchmark (Zhang et al., 2021) that are from diverse domains (e.g. income/sentiment/spam/relation/question/topic classification tasks). We highlight these datasets are only used for evaluation after our model is trained on synthetically generated data, and we never used these datasets during training. Table 1 shows the statistics of all datasets. We also use the metrics provided by the benchmark (Zhang et al., 2021) for each dataset (as different datasets need different metrics depending on their application background). All LFs are from the original authors of each dataset and are hosted in the benchmark project (Zhang, 2022a). |
| Dataset Splits | Yes | We use a synthetic validation set D to select the best run out of ten runs. The validation set is generated with a different method from a prior work (Zhang et al., 2021)... We synthetically generate the validation set D with size \|D\| = 100 according to the generation method proposed in (Zhang et al., 2021); |
| Hardware Specification | Yes | Hardware. All of our experiments were performed on a machine with a 2.20GHz Intel Xeon(R) Gold 5120 CPU, a K80 GPU and with 96GB 2666MHz RAM. |
| Software Dependencies | No | The paper mentions software like Pytorch (Paszke et al., 2019) and scikit-learn (rfs, 2022), but it does not provide specific version numbers for these or other key software components, which is required for reproducibility. |
| Experiment Setup | Yes | We use the Adam optimizer (Kingma & Ba, 2014). We set amsgrad to be true for better convergence (Reddi et al., 2019) and keep all other parameters as the default values provided by Pytorch (e.g. learning rate lr = 0.001). We use a batch size of 50... Table 7: Hyper-parameters and search space for the end models. (Provides batch size, lr, weight decay, ffn num layer, ffn hidden size for MLP and BERT) |
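The optimizer configuration quoted above (Adam with `amsgrad=True` and PyTorch defaults otherwise) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the model here is a placeholder linear layer, while the paper trains its hyper label model architecture.

```python
# Sketch of the reported optimizer setup: Adam with amsgrad=True for better
# convergence (Reddi et al., 2019), all other hyper-parameters left at
# PyTorch defaults (e.g. lr = 0.001), batch size 50.
import torch

model = torch.nn.Linear(10, 2)  # placeholder network, not the paper's architecture
optimizer = torch.optim.Adam(model.parameters(), amsgrad=True)  # lr defaults to 1e-3

batch_size = 50  # as stated in the experiment setup
```

Because only `amsgrad` is overridden, the remaining defaults (`betas=(0.9, 0.999)`, `eps=1e-8`, `weight_decay=0`) apply as documented by PyTorch.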