Collective Noise Contrastive Estimation for Policy Transfer Learning

Authors: Weinan Zhang, Ulrich Paquet, Katja Hofmann

AAAI 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our empirical study based on two real-world music usage data sources from Xbox Music shows excellent performance in terms of both data generation likelihood and expected policy value from successfully transferring knowledge from user generated playlists to their radio listening behaviour.
Researcher Affiliation | Collaboration | University College London, Microsoft Research
Pseudocode | No | The paper describes the methods and equations but does not include a clearly labeled pseudocode or algorithm block.
Open Source Code | No | The paper does not provide an explicit statement or link for the open-source code of the methodology described in this paper. A footnote links to supplementary material for derivation, not code.
Open Datasets | No | We test the proposed models on two proprietary datasets collected from Xbox Music, an online commercial music radio service.
Dataset Splits | Yes | Both for the playlist dataset and radio dataset, we randomly sample 10,000 transitions as the validation data and test data respectively, while the remainder is used as training data.
Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU/CPU models, memory details) used to run its experiments.
Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers (e.g., library or solver names with versions) needed to replicate the experiment.
Experiment Setup | Yes | The latent feature vectors are all in 32 dimensions. We obtain the empirically optimal settings of radio-task weight α = 0.9, inter-domain regularisation term λ2 = 0.07, and noise-data ratio k = 100. Here NCE-X algorithms set λ1 = 0 in Eq. (16), while NCE-X-L algorithms sets λ1 = 0.05.
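
The Dataset Splits and Experiment Setup rows can be illustrated with a minimal Python sketch. Since the paper releases no code, everything here is an assumption made for illustration: the names `REPORTED_SETTINGS` and `split_transitions`, the dictionary keys, the fixed seed, and the reading of "respectively" as 10,000 transitions for validation plus a separate 10,000 for test. Only the numeric values come from the rows above.

```python
import random

# Values as reported in the Experiment Setup row; the key names are hypothetical.
REPORTED_SETTINGS = {
    "latent_dim": 32,   # dimensionality of the latent feature vectors
    "alpha": 0.9,       # radio-task weight
    "lambda_2": 0.07,   # inter-domain regularisation term
    "k": 100,           # noise-data ratio
    "lambda_1": {"NCE-X": 0.0, "NCE-X-L": 0.05},
}


def split_transitions(transitions, n_holdout=10_000, seed=0):
    """Randomly hold out n_holdout transitions for validation and another
    n_holdout for test; the remainder is training data.

    The seed is an assumption: the paper does not report one.
    """
    shuffled = list(transitions)
    if len(shuffled) < 2 * n_holdout:
        raise ValueError("not enough transitions for two hold-out sets")
    random.Random(seed).shuffle(shuffled)
    validation = shuffled[:n_holdout]
    test = shuffled[n_holdout:2 * n_holdout]
    train = shuffled[2 * n_holdout:]
    return train, validation, test
```

Under these assumptions the same function would be called once for the playlist dataset and once for the radio dataset, for example `train, validation, test = split_transitions(radio_transitions)`, where `radio_transitions` is a placeholder for data that is not publicly available.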