Collective Noise Contrastive Estimation for Policy Transfer Learning
Authors: Weinan Zhang, Ulrich Paquet, Katja Hofmann
AAAI 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our empirical study based on two real-world music usage data sources from Xbox Music shows excellent performance in terms of both data generation likelihood and expected policy value from successfully transferring knowledge from user generated playlists to their radio listening behaviour. |
| Researcher Affiliation | Collaboration | University College London, Microsoft Research |
| Pseudocode | No | The paper describes the methods and equations but does not include a clearly labeled pseudocode or algorithm block. |
| Open Source Code | No | The paper does not provide an explicit statement or link for the open-source code of the methodology described in this paper. A footnote links to supplementary material for derivation, not code. |
| Open Datasets | No | We test the proposed models on two proprietary datasets collected from Xbox Music, an online commercial music radio service. |
| Dataset Splits | Yes | Both for the playlist dataset and radio dataset, we randomly sample 10,000 transitions as the validation data and test data respectively, while the remainder is used as training data. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU/CPU models, memory details) used to run its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers (e.g., library or solver names with versions) needed to replicate the experiment. |
| Experiment Setup | Yes | The latent feature vectors are all in 32 dimensions. We obtain the empirically optimal settings of radio-task weight α = 0.9, inter-domain regularisation term λ2 = 0.07, and noise-data ratio k = 100. Here NCE-X algorithms set λ1 = 0 in Eq. (16), while NCE-X-L algorithms set λ1 = 0.05. |
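
The Dataset Splits and Experiment Setup rows can be read together as a small configuration plus a random hold-out. The sketch below is a minimal illustration, not the authors' code, which is not released. The names `HYPERPARAMS` and `split_transitions`, the fixed seed, and the reading of "respectively" as 10,000 transitions each for validation and test are all assumptions.

```python
import random

# Values quoted in the Experiment Setup row; the key names are illustrative.
HYPERPARAMS = {
    "latent_dim": 32,    # dimensionality of the latent feature vectors
    "alpha": 0.9,        # radio-task weight
    "lambda_2": 0.07,    # inter-domain regularisation term
    "k": 100,            # noise-data ratio
    "lambda_1": {"NCE-X": 0.0, "NCE-X-L": 0.05},  # per algorithm family, Eq. (16)
}


def split_transitions(transitions, n_val=10_000, n_test=10_000, seed=0):
    """Randomly hold out validation and test transitions; the rest is training.

    Assumption: 10,000 transitions each for validation and test, applied
    separately to the playlist dataset and to the radio dataset.
    """
    shuffled = list(transitions)
    if len(shuffled) < n_val + n_test:
        raise ValueError("not enough transitions for the requested hold-out sizes")
    # A seeded generator keeps the split repeatable (the seed is an assumption).
    random.Random(seed).shuffle(shuffled)
    val = shuffled[:n_val]
    test = shuffled[n_val:n_val + n_test]
    train = shuffled[n_val + n_test:]
    return train, val, test
```

Calling `split_transitions` once per dataset would yield disjoint training, validation, and test sets. The paper states only that sampling is random, so any seed or shuffling scheme is a choice made here.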