Curriculum Disentangled Recommendation with Noisy Multi-feedback
Authors: Hong Chen, Yudong Chen, Xin Wang, Ruobing Xie, Rui Wang, Feng Xia, Wenwu Zhu
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our extensive experiments on several real-world datasets demonstrate that the proposed CDR model can significantly outperform several state-of-the-art methods in terms of recommendation accuracy. |
| Researcher Affiliation | Collaboration | 1Tsinghua University, 2WeChat Search Application Department, Tencent. |
| Pseudocode | Yes | Algorithm 1 Adjustable Self-evaluating Curriculum towards A Better Self |
| Open Source Code | Yes | Our code will be released at https://github.com/forchchch/CDR |
| Open Datasets | Yes | We conduct our experiments on four real-world datasets: WeChat, MovieLens-1M [44], Amazon Sports [45] and Amazon Beauty [45]. |
| Dataset Splits | Yes | The whole dataset is chronologically divided to the train, valid, and test dataset by the ratio of 8:1:1. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., CPU, GPU models, memory, or cloud instances) used for running the experiments. |
| Software Dependencies | No | The paper mentions 'Tensorflow' but does not provide specific version numbers for it or any other software libraries or dependencies. |
| Experiment Setup | Yes | We implement our method in Tensorflow and use the Adagrad [51] optimizer for mini-batch gradient descent that is suitable for sparse data, while the size of each mini-batch is 256. All the mentioned transformer encoders are four-head and one-layer. We cap the maximum sequential historical behavior length to 30 for all datasets. We fix µ in the curriculum to 10 and the other hyper-parameters are then tuned using random search. The search space is listed as follows. The number of latent intentions K ∈ {1, 2, ..., 8}. The prior confidence for the unclicked data λ ∈ {0.1, 0.2, ..., 1.0}. The learning rate ∈ {0.0001, 0.001, 0.01, 0.1, 1.0}. The hidden size of each field of feature ∈ {32, 64, 128, 256}. |
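The quoted setup tunes hyper-parameters by random search over a small discrete space. A minimal sketch of such a search is shown below; the grid values come from the paper's stated search space, but the sampling loop, names (`SEARCH_SPACE`, `sample_config`), and trial count are illustrative assumptions, not the authors' released code.

```python
import random

# Discrete search space quoted in the paper's experiment setup.
SEARCH_SPACE = {
    "num_intentions_K": list(range(1, 9)),                     # K in {1, ..., 8}
    "prior_confidence_lambda": [round(0.1 * i, 1) for i in range(1, 11)],
    "learning_rate": [0.0001, 0.001, 0.01, 0.1, 1.0],
    "hidden_size": [32, 64, 128, 256],
}

def sample_config(rng: random.Random) -> dict:
    """Draw one hyper-parameter configuration uniformly at random."""
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}

# A random-search run would train one model per sampled configuration
# and keep the best by validation accuracy; here we only draw the trials.
rng = random.Random(0)
trials = [sample_config(rng) for _ in range(20)]
```

Each trial would then be trained with Adagrad (batch size 256) as described in the setup, with the best configuration selected on the validation split.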