Text-Based Interactive Recommendation via Constraint-Augmented Reinforcement Learning
Authors: Ruiyi Zhang, Tong Yu, Yilin Shen, Hongxia Jin, Changyou Chen
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive empirical evaluations are performed on text-based interactive recommendation and constrained text generation tasks, demonstrating consistent performance improvement over existing approaches. |
| Researcher Affiliation | Collaboration | 1 Duke University, 2 Samsung Research America, 3 University at Buffalo |
| Pseudocode | Yes | Algorithm 1 Reward Constrained Recommendation |
| Open Source Code | No | The paper refers to third-party open-source libraries like 'OpenAI Baselines' (https://github.com/openai/baselines) and 'Stable Baselines' (https://github.com/hill-a/stable-baselines), but does not provide a link or statement about the availability of its own specific implementation code for the described methodology. |
| Open Datasets | Yes | Our approaches are evaluated on the UT-Zappos50K dataset [50, 51]. We use the Yelp review dataset [40] to validate the proposed methods. |
| Dataset Splits | Yes | We split the data as 444,000, 63,500, and 127,000 sentences in the training, validation and test sets, respectively. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used for running the experiments (e.g., GPU models, CPU types, or memory). |
| Software Dependencies | No | The paper mentions software components like 'Adam [25] as the optimizer', 'ResNet50 [21]', and 'LSTM [23]', but does not specify version numbers for any programming languages, libraries, or frameworks used (e.g., PyTorch, TensorFlow). |
| Experiment Setup | Yes | We set α = 0.5 and λmax = 1. We use Adam as the optimizer, where the initial learning rate is set to 0.001 with a batch size of 64. All models are trained for 100,000 iterations (user sessions). |