Cross-Task Knowledge Distillation in Multi-Task Recommendation
Authors: Chenxiao Yang, Junwei Pan, Xiaofeng Gao, Tingyu Jiang, Dapeng Liu, Guihai Chen
AAAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Comprehensive experiments are conducted to verify the effectiveness of our framework on real-world datasets. We conduct comprehensive experiments on a large-scale public dataset and a real-world production dataset that is collected from our platform. The results demonstrate that CrossDistil achieves state-of-the-art performance. The ablation studies also thoroughly dissect the effectiveness of its modules. |
| Researcher Affiliation | Collaboration | Chenxiao Yang¹, Junwei Pan², Xiaofeng Gao¹, Tingyu Jiang², Dapeng Liu², Guihai Chen¹ (¹ Department of Computer Science and Engineering, Shanghai Jiao Tong University; ² Tencent Inc.) |
| Pseudocode | Yes | Algorithm 1: Training Algorithm for CrossDistil (a hedged training-step sketch is given after this table) |
| Open Source Code | No | No explicit statement or link providing concrete access to the source code for the methodology described in this paper was found. |
| Open Datasets | Yes | We conduct experiments on a publicly accessible dataset TikTok and our WeChat dataset. The TikTok dataset is collected from a short-video app with two types of user feedback, i.e., Finish watching and Like. ... https://www.biendata.xyz/competition/icmechallenge2019/data/ |
| Dataset Splits | Yes | For TikTok, we randomly choose 80% of the samples as the training set, 10% as the validation set, and the rest as the test set. For WeChat, we split the data according to days and use the data of the first four days for training and the last day for validation and test. (A split sketch is given after this table.) |
| Hardware Specification | No | No specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running its experiments are mentioned in the paper. |
| Software Dependencies | No | No specific software dependencies with version numbers (e.g., "Python 3.8", "PyTorch 1.9") are provided in the paper. |
| Experiment Setup | Yes | RQ5: Hyper-parameter Study. This subsection studies the performance variation of CrossDistil w.r.t. some key hyper-parameters (i.e., the error correction margin m, the auxiliary ranking loss coefficients β1 and β2, and the distillation loss weight α). Figure 3(a) shows the Multi-AUC performance as the error correction margin ranges from 4 to 4. As we can see, the model performance first increases and then decreases. ... The results in Fig. 3(d) reveal that a proper α between 0 and 1 brings the best performance, which is reasonable since the distillation loss plays the role of label smoothing regularization and cannot replace hard labels. (A sketch of this loss combination is given after this table.) |
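
The paper's Algorithm 1 is not reproduced on this page, but the loss components quoted above (per-task hard-label losses, auxiliary ranking losses weighted by β1 and β2, and a distillation term weighted by α that acts as label-smoothing-style regularization) suggest a training step of roughly the following shape. This is a minimal PyTorch sketch under those assumptions only; the `model` interface, the sigmoid/detach handling of the teacher logits, and the use of binary cross-entropy are illustrative guesses, not the paper's actual implementation.

```python
import torch
import torch.nn.functional as F

def crossdistil_step(model, batch, alpha=0.5, beta=(0.1, 0.1)):
    """One hypothetical training step combining hard-label, ranking, and distillation losses.

    `model(features)` is assumed to return, per task, a student logit, a teacher
    (augmented-task) logit, and an auxiliary ranking loss; this interface is an
    assumption made for illustration, not the paper's actual API.
    """
    hard_loss = rank_loss = distill_loss = 0.0
    outputs = model(batch["features"])
    for t, (student_logit, teacher_logit, aux_rank_loss) in enumerate(outputs):
        label = batch["labels"][:, t].float()
        # Standard hard-label loss for each task.
        hard_loss = hard_loss + F.binary_cross_entropy_with_logits(student_logit, label)
        # Auxiliary ranking loss, weighted per task (beta1, beta2 in the paper).
        rank_loss = rank_loss + beta[t] * aux_rank_loss
        # Distillation term: detached teacher probabilities regularize the student,
        # playing a label-smoothing-like role as described in the hyper-parameter study.
        soft_target = torch.sigmoid(teacher_logit).detach()
        distill_loss = distill_loss + F.binary_cross_entropy_with_logits(student_logit, soft_target)
    # alpha interpolates between hard labels and distilled soft labels; the quoted
    # result that the best alpha lies strictly between 0 and 1 matches this reading.
    return (1.0 - alpha) * hard_loss + alpha * distill_loss + rank_loss
```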
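
The split protocol quoted in the Dataset Splits row is easy to restate in code. Below is a minimal sketch, assuming the TikTok interactions are loaded into a pandas DataFrame and the WeChat log carries a day column; the column name `day`, the random seed, and the 50/50 validation/test division of the last WeChat day are assumptions, since the paper only states that the last day is used for validation and test.

```python
import numpy as np
import pandas as pd

def split_tiktok(df: pd.DataFrame, seed: int = 42):
    """Random 80% / 10% / 10% split, as described for the TikTok dataset."""
    idx = np.random.default_rng(seed).permutation(len(df))
    n_train, n_val = int(0.8 * len(df)), int(0.1 * len(df))
    train = df.iloc[idx[:n_train]]
    val = df.iloc[idx[n_train:n_train + n_val]]
    test = df.iloc[idx[n_train + n_val:]]
    return train, val, test

def split_wechat(df: pd.DataFrame, day_col: str = "day"):
    """Day-based split: first four days for training, last day held out."""
    days = sorted(df[day_col].unique())
    train = df[df[day_col].isin(days[:4])]
    holdout = df[df[day_col] == days[-1]]
    # The paper uses the last day for both validation and test; splitting it
    # 50/50 here is an assumption, not stated in the paper.
    val = holdout.sample(frac=0.5, random_state=0)
    test = holdout.drop(val.index)
    return train, val, test
```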