Cross-Task Knowledge Distillation in Multi-Task Recommendation

Authors: Chenxiao Yang, Junwei Pan, Xiaofeng Gao, Tingyu Jiang, Dapeng Liu, Guihai Chen

AAAI 2022, pp. 4318-4326

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Comprehensive experiments are conducted to verify the effectiveness of our framework on real-world datasets. We conduct comprehensive experiments on a large-scale public dataset and a real-world production dataset that is collected from our platform. The results demonstrate that CrossDistil achieves state-of-the-art performance. The ablation studies also thoroughly dissect the effectiveness of its modules.
Researcher Affiliation | Collaboration | Chenxiao Yang (1), Junwei Pan (2), Xiaofeng Gao (1), Tingyu Jiang (2), Dapeng Liu (2), Guihai Chen (1); (1) Department of Computer Science and Engineering, Shanghai Jiao Tong University; (2) Tencent Inc.
Pseudocode | Yes | Algorithm 1: Training Algorithm for CrossDistil
Open Source Code | No | No explicit statement or link providing concrete access to the source code for the methodology described in this paper was found.
Open Datasets | Yes | We conduct experiments on a publicly accessible dataset TikTok and our WeChat dataset. TikTok dataset is collected from a short-video app with two types of user feedback, i.e., Finish watching and Like. ... https://www.biendata.xyz/competition/icmechallenge2019/data/
Dataset Splits | Yes | For TikTok, we randomly choose 80% samples as training set, 10% as validation set and the rest as test set. For WeChat, we split the data according to days and use the data of the first four days for training and the last day for validation and test. (A minimal split sketch, under stated assumptions, follows the table.)
Hardware Specification | No | No specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running its experiments are mentioned in the paper.
Software Dependencies | No | No specific software dependencies with version numbers (e.g., "Python 3.8", "PyTorch 1.9") are provided in the paper.
Experiment Setup | Yes | RQ5: Hyper-parameter Study. This subsection studies the performance variation of CrossDistil w.r.t. some key hyper-parameters (i.e., error correction margin m, auxiliary ranking loss coefficients β1 and β2, and distillation loss weight α). Figure 3(a) shows the Multi-AUC performance with the error correction margin ranging from −4 to 4. As we can see, the model performance first increases and then decreases. ... The results in Fig. 3(d) reveal that a proper α from 0 to 1 can bring the best performance, which is reasonable since the distillation loss plays the role of label smoothing regularization and could not replace hard labels. (A hedged sketch of how these hyper-parameters might enter the training objective follows the table.)
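
The Dataset Splits row describes two split schemes: a random 80/10/10 split for TikTok and a day-based split for WeChat. The sketch below illustrates both; it is not the authors' preprocessing code, and the column name "date", the use of pandas, and the even division of the last day between validation and test are assumptions made only for illustration.

```python
# Minimal sketch of the two split schemes described in the "Dataset Splits" row.
import numpy as np
import pandas as pd


def random_split(df: pd.DataFrame, seed: int = 42):
    """TikTok-style split: 80% train / 10% validation / 10% test by random sampling."""
    idx = np.random.default_rng(seed).permutation(len(df))
    n_train = int(0.8 * len(df))
    n_val = int(0.1 * len(df))
    train = df.iloc[idx[:n_train]]
    val = df.iloc[idx[n_train:n_train + n_val]]
    test = df.iloc[idx[n_train + n_val:]]
    return train, val, test


def day_split(df: pd.DataFrame, date_col: str = "date"):
    """WeChat-style split: first four days for training, last day for validation and test."""
    days = sorted(df[date_col].unique())
    train = df[df[date_col].isin(days[:4])]
    last_day = df[df[date_col] == days[-1]]
    # The paper does not state how the last day is divided between validation and test;
    # an even split is assumed here purely for illustration.
    half = len(last_day) // 2
    val, test = last_day.iloc[:half], last_day.iloc[half:]
    return train, val, test
```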
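
The Experiment Setup row names four hyper-parameters (margin m, ranking loss weights β1 and β2, distillation weight α) without showing how they combine. The PyTorch sketch below is one plausible way such terms could enter a single objective, consistent with the row's description of the distillation loss acting as label smoothing alongside hard labels; the error-correction rule, the convex hard/soft weighting, and all function names are illustrative assumptions, not the exact CrossDistil formulation from the paper.

```python
# Hedged sketch: margin-based correction of teacher logits plus a weighted sum of
# hard-label, distillation, and auxiliary ranking losses (assumed form).
import torch
import torch.nn.functional as F


def correct_teacher_logits(teacher_logits, labels, m=1.0):
    """Push teacher logits at least a margin m toward the side of the hard label
    before distillation (illustrative error-correction rule)."""
    sign = 2.0 * labels - 1.0  # +1 for positive labels, -1 for negative labels
    return torch.where(sign * teacher_logits < m, sign * m, teacher_logits)


def total_loss(student_logits, teacher_logits, labels,
               rank_loss_task1, rank_loss_task2,
               alpha=0.5, beta1=0.1, beta2=0.1, m=1.0):
    # Hard-label loss on the student's own predictions.
    hard = F.binary_cross_entropy_with_logits(student_logits, labels)
    # Soft targets from the (corrected, detached) teacher logits.
    soft_targets = torch.sigmoid(correct_teacher_logits(teacher_logits.detach(), labels, m))
    distill = F.binary_cross_entropy_with_logits(student_logits, soft_targets)
    # Distillation smooths but does not replace the hard labels; ranking losses are
    # weighted by beta1 and beta2 (weighting scheme assumed).
    return (1 - alpha) * hard + alpha * distill + beta1 * rank_loss_task1 + beta2 * rank_loss_task2
```

Keeping α strictly between 0 and 1 in this sketch mirrors the row's observation that the distillation term regularizes the hard-label loss rather than substituting for it.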