Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
Lending Interaction Wings to Recommender Systems with Conversational Agents
Authors: Jiarui Jin, Xianyu Chen, Fanghua Ye, Mengyue Yang, Yue Feng, Weinan Zhang, Yong Yu, Jun Wang
NeurIPS 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on 8 industrial datasets show that CORE could be seamlessly employed on 9 popular recommendation approaches, and can consistently bring significant improvements, compared against either recently proposed reinforcement learning-based or classical statistical methods, in both hot-start and cold-start recommendation settings. |
| Researcher Affiliation | Academia | 1Shanghai Jiao Tong University, 2University College London |
| Pseudocode | Yes | Algorithm 1 CORE for Querying Items and Attributes |
| Open Source Code | No | The paper does not include an explicit statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We conduct experiments on 8 industrial datasets (including both tabular data, sequential behavioral data and graph-structured data) with 9 popular recommendation approaches (e.g., DeepFM [20], DIN [52]). Amazon dataset [8, 33] is a dataset collected by Amazon... LastFM dataset [9] is a dataset collected from Lastfm... Yelp dataset [12] is a dataset collected from Yelp... |
| Dataset Splits | No | The paper describes how sessions are constructed for evaluation and references a 'training set' in one table, but it does not provide specific train/validation/test dataset splits (e.g., percentages, sample counts, or citations to predefined splits) for reproducibility. |
| Hardware Specification | Yes | All the models are trained under the same hardware settings with 16-Core AMD Ryzen 9 5950X (2.194GHZ), 62.78GB RAM, NVIDIA GeForce RTX 3080 cards. |
| Software Dependencies | No | The paper mentions following official implementations for recommendation approaches and includes a small Python snippet using 'openai' and 'os' libraries in an appendix, but it does not provide a list of specific software dependencies with version numbers for the main experimental setup. |
| Experiment Setup | Yes | The learning rate is decreased from the initial value $1 \times 10^{-2}$ to $1 \times 10^{-6}$ during the training process. The batch size is set as 100. The weight for the L2 regularization term is $4 \times 10^{-5}$. The dropout rate is set as 0.5. The dimension of embedding vectors is set as 64. |
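
The hyperparameters quoted in the Experiment Setup row can be collected into a small configuration object, sketched below. The values come from the quoted text, reading the learning rate bounds as 1e-2 and 1e-6 and the L2 weight as 4e-5. The names `TrainConfig` and `lr_at` are hypothetical, and the geometric decay schedule is an assumption: the quoted text says only that the learning rate decreases between the two values, not how.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters as quoted in the Experiment Setup row (names are hypothetical)."""
    lr_initial: float = 1e-2   # initial learning rate
    lr_final: float = 1e-6     # learning rate reached by the end of training
    batch_size: int = 100
    l2_weight: float = 4e-5    # weight of the L2 regularization term
    dropout: float = 0.5
    embedding_dim: int = 64


def lr_at(step: int, total_steps: int, cfg: TrainConfig) -> float:
    """Learning rate at a given step.

    ASSUMPTION: geometric interpolation between lr_initial and lr_final.
    The source does not state the actual decay schedule.
    """
    if total_steps < 2:
        return cfg.lr_initial
    # Clamp the step into [0, total_steps - 1], then map it to [0, 1].
    frac = min(max(step, 0), total_steps - 1) / (total_steps - 1)
    return cfg.lr_initial * (cfg.lr_final / cfg.lr_initial) ** frac


cfg = TrainConfig()
```

Calling `lr_at(0, total_steps, cfg)` returns the initial rate and `lr_at(total_steps - 1, total_steps, cfg)` returns the final one. Any other monotone schedule between the same endpoints would be equally consistent with the quoted description.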