Adversarial Counterfactual Learning and Evaluation for Recommender System
Authors: Da Xu, Chuanwei Ruan, Evren Korpeoglu, Sushant Kumar, Kannan Achan
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We carry out extensive simulation and real data experiments to demonstrate our performance, and deploy online experiments to fully illustrate the benefits of the proposed approach. In the simulation study, we generate the synthetic data using real-world explicit feedback dataset so that we have access to the oracle exposure mechanism. We then show that models trained by our approach achieve superior unbiased offline evaluation performances. In the real-world data analysis, we demonstrate that the models trained by our approach also achieve more improvements even using the standard offline evaluation. By conducting online experiments, we verify that our robust evaluation is more accurate than the standard offline evaluation when compared with the actual online evaluations. |
| Researcher Affiliation | Industry | Da Xu, Chuanwei Ruan: Walmart Labs, Sunnyvale, CA 94086, {Da.Xu, Chuanwei.Ruan}@walmartlabs.com. Evren Korpeoglu, Sushant Kumar, Kannan Achan: Walmart Labs, Sunnyvale, CA 94086, {EKorpeoglu, SKumar4, KAchan}@walmartlabs.com |
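| Pseudocode | Yes | Algorithm 1: Minimax optimization. Input: learning rates $r_\theta, r_\psi$; discounts $d_\theta, d_\psi > 1$. While the loss has not stabilized, do: $\theta = \theta - r_\theta \mathbb{E}_{\text{batch}}[\nabla_\theta \ell(f_\theta, g_\psi)]$; $\psi = \psi + r_\psi \mathbb{E}_{\text{batch}}[\nabla_\psi \ell(f_\theta, g_\psi)]$; $r_\theta = r_\theta / d_\theta$, $r_\psi = r_\psi / d_\psi$; end. |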
| Open Source Code | No | The paper states that 'All the detailed data processing, experiment setup, model configuration, parameter tuning, training procedure, validation, testing and sensitivity analysis are provided in the appendix,' but it does not explicitly mention that source code for the methodology is provided in the appendix or via a specific link. |
| Open Datasets | Yes | We use the explicit feedback data from MovieLens-1M and Goodreads datasets. Other than using the MovieLens-1M and Goodreads data in the implicit feedback setting, we further include the LastFM music recommendation (implicit feedback) dataset. All the data sources, processing steps and other detailed descriptions are provided in the appendix. |
| Dataset Splits | Yes | In particular, all but the last two user-item interactions are used for training, the second-to-last interaction is used for validation, and the last interaction is used for testing. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for experiments (e.g., CPU, GPU models, memory, or cloud instance types). |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers. |
| Experiment Setup | No | The paper states, 'All the detailed data processing, experiment setup, model configuration, parameter tuning, training procedure, validation, testing and sensitivity analysis are provided in the appendix.' While Algorithm 1 shows learning rates and discounts as parameters, specific numerical values for these or other hyperparameters are not provided in the main text. |
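
The split quoted in the Dataset Splits row (all but the last two interactions for training, the second-to-last for validation, the last for testing) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `leave_last_two_split`, the `(user, item, timestamp)` tuple layout, ordering by timestamp, and the handling of users with fewer than three interactions are all assumptions made here.

```python
from collections import defaultdict


def leave_last_two_split(interactions):
    """Split per-user interaction histories into train / validation / test.

    `interactions` is an iterable of (user, item, timestamp) tuples.
    The tuple layout and timestamp ordering are assumptions of this sketch.
    """
    by_user = defaultdict(list)
    for user, item, ts in interactions:
        by_user[user].append((ts, item))

    train, valid, test = {}, {}, {}
    for user, events in by_user.items():
        events.sort(key=lambda e: e[0])  # oldest interaction first
        items = [item for _, item in events]
        if len(items) < 3:
            # Assumption: too-short histories are kept for training only.
            train[user] = items
            continue
        train[user] = items[:-2]   # all but the last two interactions
        valid[user] = items[-2]    # second-to-last interaction
        test[user] = items[-1]     # last interaction
    return train, valid, test


# Hypothetical toy data, not taken from any of the datasets above.
example = [
    ("u1", "a", 1), ("u1", "b", 2), ("u1", "c", 3), ("u1", "d", 4),
    ("u2", "x", 5), ("u2", "y", 6),
]
train, valid, test = leave_last_two_split(example)
```

Here `u1` contributes items to all three splits, while `u2` has only two interactions and therefore appears in `train` alone under the stated assumption.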
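
The update pattern of Algorithm 1 in the Pseudocode row (a descent step on θ, an ascent step on ψ, then a division of both learning rates by their discounts) can be illustrated on a toy problem. The sketch below is not the paper's model: the quadratic saddle objective, the exact gradients in place of batch expectations, the default step sizes, and the "unchanged for several consecutive steps" reading of "loss stabilized" are all assumptions chosen so the example runs without any dependencies.

```python
def loss(theta, psi):
    # Toy saddle objective (an assumption, not the paper's loss):
    # convex in theta, concave in psi, with its saddle point at (0, 0).
    return theta ** 2 - psi ** 2 + theta * psi


def grad_theta(theta, psi):
    return 2 * theta + psi


def grad_psi(theta, psi):
    return theta - 2 * psi


def minimax(theta, psi, r_theta=0.1, r_psi=0.1, d_theta=1.01, d_psi=1.01,
            tol=1e-12, patience=5, max_steps=10000):
    """Gradient descent on theta, ascent on psi, with decaying step sizes."""
    prev = loss(theta, psi)
    stable = 0
    for _ in range(max_steps):
        theta = theta - r_theta * grad_theta(theta, psi)  # minimizing player
        psi = psi + r_psi * grad_psi(theta, psi)          # maximizing player
        r_theta, r_psi = r_theta / d_theta, r_psi / d_psi  # discounts > 1
        cur = loss(theta, psi)
        # Count consecutive steps in which the loss barely moved.
        stable = stable + 1 if abs(cur - prev) < tol else 0
        prev = cur
        if stable >= patience:
            break
    return theta, psi


theta_star, psi_star = minimax(theta=1.0, psi=-1.0)
```

On this toy objective the iterates approach the saddle point at the origin. In the paper's setting the two gradients would instead be batch estimates taken with respect to the parameters of $f_\theta$ and $g_\psi$.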