Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Confidence Weighted Multitask Learning
Authors: Peng Yang, Peilin Zhao, Jiayu Zhou, Xin Gao | Pages 5636-5643
AAAI 2019 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct the performance evaluation for the algorithms on three real-world datasets. We begin with the introduction of the experimental data and evaluation metrics. Then we show and discuss the empirical results. The experimental results are presented in Table 2. We also show the evaluation measures with respect to the round of online learning in Fig. 1. |
| Researcher Affiliation | Collaboration | Peng Yang (1), Peilin Zhao (2), Jiayu Zhou (3), Xin Gao (1). (1) King Abdullah University of Science and Technology, Saudi Arabia; (2) Tencent AI Lab, China; (3) Michigan State University, USA |
| Pseudocode | Yes | Algorithm 1 CWMT: Confidence Weighted Multitask Learning, Algorithm 2 ACWMT: Active Confidence Weighted Multitask Learning |
| Open Source Code | Yes | Proof. The proof is in the Supplementary Material (footnote 1: https://github.com/YoungBigBird1985/Second-Order-Online-Multitask-Learning). |
| Open Datasets | Yes | Spam Email (http://labs-repos.iit.demokritos.gr/skel/i-config/), maintained by Internet Content Filtering Group, collects 7068 emails from mailboxes of 4 users (i.e., 4 tasks)... MHC-I (http://web.cs.iastate.edu/~honavar/ailab/), a biomarker dataset, contains 18664 peptide sequences from 12 MHC-I molecules (i.e., 12 tasks)... EachMovie (http://goldberg.berkeley.edu/jester-data/) is a movie recommendation dataset... |
| Dataset Splits | No | The paper describes an online learning setting where models are trained incrementally on data streams and does not specify traditional train/validation/test splits, percentages, or cross-validation setups. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., CPU, GPU models, or detailed cloud instance specifications) used to run the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x, or specific library versions) for reproducibility. |
| Experiment Setup | Yes | For simplicity, we set ϵ = λ = 100 to avoid overfitting. In the query method, we set h = 0.1 for MHC-I and EachMovie, h = 1 for Spam Email. |
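
The reported settings in the Experiment Setup row can be captured in a small configuration sketch, which may help when attempting a reproduction. The values of ϵ, λ, and h are taken from the table above; everything else is an assumption made for illustration. In particular, the names `ExperimentSetup`, `SETUPS`, `query_probability`, and `should_query` are hypothetical, and the query rule `h / (h + |margin|)` is a form commonly used in margin-based selective sampling, not one confirmed by this page as the paper's ACWMT rule.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentSetup:
    """Hyperparameters quoted in the Experiment Setup row."""
    epsilon: float = 100.0  # ϵ = 100, as reported
    lam: float = 100.0      # λ = 100, as reported
    h: float = 0.1          # query parameter, varies by dataset


# Per-dataset values of h, as reported in the table.
SETUPS = {
    "MHC-I": ExperimentSetup(h=0.1),
    "EachMovie": ExperimentSetup(h=0.1),
    "Spam Email": ExperimentSetup(h=1.0),
}


def query_probability(margin: float, h: float) -> float:
    """ASSUMPTION: a generic margin-based query rule, h / (h + |margin|).

    This is not taken from the paper; it only illustrates how a
    parameter like h can trade label cost against confidence.
    """
    return h / (h + abs(margin))


def should_query(margin: float, h: float, rng: random.Random) -> bool:
    """Draw a Bernoulli sample: True means the label is requested."""
    return rng.random() < query_probability(margin, h)


if __name__ == "__main__":
    rng = random.Random(0)
    for name, setup in SETUPS.items():
        # A low-confidence prediction (small margin) is queried more often.
        p = query_probability(margin=0.5, h=setup.h)
        print(f"{name}: h={setup.h}, P(query | margin=0.5)={p:.3f}")
```

Under this assumed rule, a larger h makes the learner request labels more often at the same margin, so h = 1 would query more aggressively than h = 0.1. Whether the paper's query method behaves the same way should be checked against Algorithm 2 in the paper.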