Self-Paced Multi-Task Learning

Authors: Changsheng Li, Junchi Yan, Fan Wei, Weishan Dong, Qingshan Liu, Hongyuan Zha

AAAI 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Experimental results on the toy and real-world datasets demonstrate the effectiveness of the proposed approach, compared to the state-of-the-arts. |
| Researcher Affiliation | Collaboration | Changsheng Li (1,2), Junchi Yan (1,2), Fan Wei (3), Weishan Dong (2), Qingshan Liu (4), Hongyuan Zha (5). Affiliations: 1 East China Normal University; 2 IBM Research China; 3 Stanford University; 4 Nanjing University of Info. Science & Tech.; 5 Georgia Institute of Technology. |
| Pseudocode | Yes | Algorithm 1: Self-Paced Multi-Task Learning (SPMTL). A hedged Python sketch of the alternating loop follows this table. |
| Open Source Code | No | The paper contains no statement or link indicating that source code for the method is publicly available. |
| Open Datasets | Yes | OHSUMED (Hersh et al. 1994) and Isolet (available at http://www.cad.zju.edu.cn/home/dengcai/Data/MLData.html). The first is an ordinal regression dataset; the second is collected from 150 speakers. |
| Dataset Splits | No | The paper mentions that it "randomly select[s] the training instances from each task with different training ratios (5%, 10% and 15%)" and uses the remaining instances as the test set, but it does not describe a distinct validation set or its split. A sketch of this per-task split appears after the algorithm sketch below. |
| Hardware Specification | No | The paper does not report hardware details such as CPU/GPU models, memory, or the computing environment used for the experiments. |
| Software Dependencies | No | The paper does not list software dependencies with version numbers (e.g., programming languages, libraries, frameworks). |
| Experiment Setup | Yes | The regularization parameter α in Eq. (3) controls the complexity of the basis tasks; the authors find α = 100 works well on all three datasets and fix it throughout. β is tuned over [0.001, 0.01, 0.1, 1, 10, 100]. λ and γ determine how many tasks are selected for training, so they are initialized such that more than 20% of tasks are selected, using grid search under the principle that a larger λ and a smaller γ make more weights large. After initialization, λ is increased and γ is decreased to gradually involve harder tasks and instances at each iteration. A sketch of this protocol closes the section. |
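
To make the Algorithm 1 entry concrete, here is a minimal sketch of the alternating self-paced loop, assuming squared loss, hard (0/1) self-paced weights, and independent ridge regressors per task. The paper's actual objective couples tasks through a shared basis whose complexity α controls; that coupling is simplified away here. The function name `spmtl_sketch`, the 1/γ task threshold, and the annealing factors are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def spmtl_sketch(tasks, n_outer=10, lam=0.5, gamma=2.0,
                 lam_up=1.3, gamma_down=0.8, alpha=100.0):
    """Hard-weight self-paced multi-task loop (illustrative sketch).

    tasks: list of (X, y) pairs, one per task.
    lam:   instance-level pace; instances with loss < lam are selected.
    gamma: task-level pace; tasks with mean loss < 1/gamma are selected
           (the 1/gamma form is an assumption, chosen so that decreasing
           gamma admits more tasks, matching the paper's schedule).
    """
    d = tasks[0][0].shape[1]
    T = len(tasks)
    W = np.zeros((d, T))  # one linear model per task

    for _ in range(n_outer):
        # Step 1: with W fixed, choose easy instances (v) and easy tasks (u).
        v_all, mean_loss = [], np.empty(T)
        for t, (X, y) in enumerate(tasks):
            losses = (X @ W[:, t] - y) ** 2
            v_all.append((losses < lam).astype(float))
            mean_loss[t] = losses.mean()
        u = (mean_loss < 1.0 / gamma).astype(float)

        # Step 2: with the weights fixed, refit each selected task by
        # weighted ridge regression (alpha stands in for the paper's
        # complexity penalty on the basis tasks).
        for t, (X, y) in enumerate(tasks):
            v = u[t] * v_all[t]
            if v.sum() == 0:
                continue  # task not yet involved at this pace
            A = X.T @ (v[:, None] * X) + alpha * np.eye(d)
            W[:, t] = np.linalg.solve(A, X.T @ (v * y))

        # Step 3: increase lam and decrease gamma so harder instances
        # and tasks are gradually involved (the paper's schedule).
        lam *= lam_up
        gamma *= gamma_down
    return W
```

With synthetic tasks, e.g. `tasks = [(np.random.randn(50, 5), np.random.randn(50)) for _ in range(8)]`, the models start at zero, so the initial losses are just `y ** 2` and the pace parameters decide how many instances enter the first fit.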
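The Dataset Splits row describes a per-task random split with no validation set. A minimal sketch under that reading follows; the function name and seed handling are mine, and the 5%/10%/15% ratios come from the paper.

```python
import numpy as np

def per_task_split(tasks, train_ratio=0.05, seed=0):
    """Randomly take `train_ratio` of each task's instances for training
    and use the rest for testing, mirroring the paper's 5%/10%/15%
    protocol. No validation set is carved out, since none is described.
    """
    rng = np.random.default_rng(seed)
    splits = []
    for X, y in tasks:
        idx = rng.permutation(len(y))
        n_train = max(1, int(round(train_ratio * len(y))))
        train, test = idx[:n_train], idx[n_train:]
        splits.append(((X[train], y[train]), (X[test], y[test])))
    return splits
```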
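Finally, the Experiment Setup row implies a small tuning protocol: α fixed at 100, β grid-searched, and λ/γ initialized so that more than 20% of tasks start selected, then annealed each iteration. Here is a sketch; the quantile-based initialization is my assumption, since the paper states only the 20% principle and the direction of the grid search, not the procedure itself.

```python
import numpy as np

ALPHA = 100.0                                # fixed across all datasets
BETA_GRID = [0.001, 0.01, 0.1, 1, 10, 100]   # search space from the paper

def init_pace(per_task_losses, min_task_frac=0.2):
    """Choose initial lam/gamma so roughly `min_task_frac` of tasks and
    instances are selected. Quantiles of warm-start losses are used here
    as a stand-in for the paper's unspecified grid search."""
    all_losses = np.concatenate(per_task_losses)
    lam = np.quantile(all_losses, min_task_frac)          # instance threshold
    task_means = np.array([l.mean() for l in per_task_losses])
    # Task threshold is 1/gamma (same assumption as the sketch above).
    gamma = 1.0 / max(np.quantile(task_means, min_task_frac), 1e-12)
    return lam, gamma

def anneal_pace(lam, gamma, lam_up=1.2, gamma_down=0.8):
    """One step of the paper's schedule: raise lam and lower gamma so
    harder instances and tasks are gradually involved. The multiplicative
    factors are illustrative, not from the paper."""
    return lam * lam_up, gamma * gamma_down
```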