Adversarial Label Learning
Authors: Chidubem Arachie, Bert Huang
AAAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We test adversarial label learning on a variety of datasets, comparing it with other approaches for weak supervision. In this section, we describe how we simulate domain expertise to generate weak supervision signals. We then describe the datasets we evaluated with and the compared weak supervision approaches, and we analyze the results of the experiments. Table 1 shows the mean accuracies obtained by running ALL on the different datasets. |
| Researcher Affiliation | Academia | Chidubem Arachie, Department of Computer Science, Virginia Tech, achid17@vt.edu; Bert Huang, Department of Computer Science, Virginia Tech, bhuang@vt.edu |
| Pseudocode | Yes | Algorithm 1 Adversarial Label Learning. Require: Dataset X = [x1, . . . , xn], learning rate schedule α, weak signals and bounds [(q1, b1), . . . , (qm, bm)], augmented Lagrangian parameter ρ. 1: Initialize θ (e.g., random, zeros, etc.) 2: Initialize ŷ ∈ [0, 1]^n (e.g., average of q1, . . . , qm) 3: Initialize γ ∈ R^m_{≥0} (e.g., zeros) 4: while not converged do 5: Update θ with Equation (6) 6: Update p with model and θ 7: Update ŷ with Equation (7) 8: Update γ with Equation (8) 9: end while 10: return model parameters θ |
| Open Source Code | No | The paper does not provide any concrete access information (e.g., URL to a repository) for open-source code related to the described methodology. |
| Open Datasets | Yes | Fashion-MNIST (Xiao, Rasul, and Vollgraf 2017), Breast Cancer (Blake and Merz 1998; Street, Wolberg, and Mangasarian 1993), OBS Network (Rajab et al. 2016), Cardiotocography (Ayres-de Campos et al. 2000), Clave Direction (Vurkac 2011), Credit Card (Blake and Merz 1998), Statlog Satellite (Blake and Merz 1998), Phishing Websites (Mohammad, Thabtah, and Mc Cluskey 2012), Wine Quality (Cortez et al. 2009). |
| Dataset Splits | Yes | We randomly split each dataset such that 30% is used as weak supervision data, 40% is used as training data, and 30% is used as test data. For our experiments, we use 10 such random splits and report the mean of the results. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment. |
| Experiment Setup | Yes | We use the sigmoid function as our parameterized function f_θ for estimating class probabilities of ALL and GE, i.e., [f_θ(x_j)]_{j=1}^n = 1/(1 + exp(−θ^T x)) = p_θ. We regularize this objective with an L2 penalty. The update step for the parameters is θ ← θ − α_t ∇_θ p_θ^T (1 − 2ŷ), where ∇_θ p_θ is the Jacobian matrix for the classifier f over the full dataset and α_t is a gradient step size that can decrease over time. The update for each KKT multiplier is γ_i ← γ_i − ρ (q_i^T (1 − ŷ) + (1 − q_i)^T ŷ − n b_i), which is clipped to be non-negative and uses a fixed step size ρ as dictated by the augmented Lagrangian method (Hestenes 1969). |
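
Based on the pseudocode and experiment-setup rows above, the sketch below shows how the three alternating updates of Algorithm 1 could be implemented for a logistic model. It is a minimal illustration, not the authors' code: the gradient expressions, the standard augmented-Lagrangian sign conventions (which may differ from the paper's Equations (6)-(8) as quoted), the step sizes, and the initialization are all assumptions.

```python
# Minimal NumPy sketch of the ALL training loop (Algorithm 1), assuming a
# logistic model p_theta = sigmoid(X @ theta) as in the experiment setup.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def all_train(X, weak_signals, bounds, lr=0.1, rho=1.0, l2=1e-4, iters=2000):
    """X: (n, d) features; weak_signals: (m, n) soft labels in [0, 1];
    bounds: (m,) expected-error bounds, one per weak signal."""
    n, d = X.shape
    theta = np.zeros(d)                        # model parameters
    y_hat = weak_signals.mean(axis=0)          # adversarial labels, init to signal average
    gamma = np.zeros(len(bounds))              # non-negative KKT multipliers

    for _ in range(iters):
        p = sigmoid(X @ theta)

        # Constraint values: each weak signal's expected error on y_hat minus its bound.
        g = (weak_signals @ (1 - y_hat) + (1 - weak_signals) @ y_hat) / n - bounds

        # Learner step: gradient descent on the adversarial expected error
        # (1/n)[y_hat.(1 - p) + (1 - y_hat).p], with an L2 penalty on theta.
        grad_theta = X.T @ ((1 - 2 * y_hat) * p * (1 - p)) / n + l2 * theta
        theta -= lr * grad_theta

        # Adversary step: projected gradient ascent on the augmented Lagrangian
        # error(y_hat) - sum_i [gamma_i * g_i + (rho/2) * max(g_i, 0)^2].
        d_error = (1 - 2 * sigmoid(X @ theta)) / n
        d_constraints = (1 - 2 * weak_signals) / n          # dg_i / dy_hat, one row per signal
        grad_y = d_error - (gamma + rho * np.maximum(g, 0)) @ d_constraints
        y_hat = np.clip(y_hat + lr * grad_y, 0.0, 1.0)

        # Multiplier step: fixed step size rho, clipped to stay non-negative.
        gamma = np.maximum(0.0, gamma + rho * g)

    return theta
```

The learned θ would then be evaluated on held-out test data, following the split protocol sketched next.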
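
The evaluation protocol in the "Dataset Splits" row (30% weak-supervision data, 40% training data, 30% test data, averaged over 10 random splits) could be reproduced with a helper like the one below; the function and variable names are illustrative, not taken from the paper.

```python
# Sketch of the 30/40/30 random-split protocol repeated over 10 trials.
import numpy as np

def random_splits(X, y, num_trials=10, seed=0):
    rng = np.random.default_rng(seed)
    n = len(X)
    for _ in range(num_trials):
        idx = rng.permutation(n)
        weak_idx = idx[: int(0.3 * n)]                  # 30% weak-supervision data
        train_idx = idx[int(0.3 * n): int(0.7 * n)]     # 40% training data
        test_idx = idx[int(0.7 * n):]                   # 30% test data
        yield ((X[weak_idx], y[weak_idx]),
               (X[train_idx], y[train_idx]),
               (X[test_idx], y[test_idx]))
```

Test accuracies would be computed once per trial and averaged over the 10 splits, as reported in Table 1 of the paper.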