Adversarial Counterfactual Learning and Evaluation for Recommender System

Authors: Da Xu, Chuanwei Ruan, Evren Korpeoglu, Sushant Kumar, Kannan Achan

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We carry out extensive simulation and real data experiments to demonstrate our performance, and deploy online experiments to fully illustrate the benefits of the proposed approach. In the simulation study, we generate the synthetic data using real-world explicit feedback dataset so that we have access to the oracle exposure mechanism. We then show that models trained by our approach achieve superior unbiased offline evaluation performances. In the real-world data analysis, we demonstrate that the models trained by our approach also achieve more improvements even using the standard offline evaluation. By conducting online experiments, we verify that our robust evaluation is more accurate than the standard offline evaluation when compared with the actual online evaluations.
Researcher Affiliation | Industry | Da Xu, Chuanwei Ruan: Walmart Labs, Sunnyvale, CA 94086, {Da.Xu, Chuanwei.Ruan}@walmartlabs.com; Evren Korpeoglu, Sushant Kumar, Kannan Achan: Walmart Labs, Sunnyvale, CA 94086, {EKorpeoglu, SKumar4, KAchan}@walmartlabs.com
Pseudocode | Yes | Algorithm 1: Minimax optimization. Input: learning rates r_θ, r_ψ; discounts d_θ, d_ψ > 1. while loss not stabilized do: θ = θ - r_θ E_batch[∇_θ ℓ(f_θ, g_ψ)]; ψ = ψ + r_ψ E_batch[∇_ψ ℓ(f_θ, g_ψ)]; r_θ = r_θ/d_θ, r_ψ = r_ψ/d_ψ; end
Open Source Code | No | The paper states that 'All the detailed data processing, experiment setup, model configuration, parameter tuning, training procedure, validation, testing and sensitivity analysis are provided in the appendix,' but it does not explicitly mention that source code for the methodology is provided in the appendix or via a specific link.
Open Datasets | Yes | We use the explicit feedback data from MovieLens-1M and Goodreads datasets. Other than using the MovieLens-1M and Goodreads data in the implicit feedback setting, we further include the LastFM music recommendation (implicit feedback) dataset. All the data sources, processing steps and other detailed descriptions are provided in the appendix.
Dataset Splits | Yes | In particular, all but the last two user-item interactions are used for training, the second-to-last interaction is used for validation, and the last interaction is used for testing.
Hardware Specification | No | The paper does not provide specific details about the hardware used for experiments (e.g., CPU, GPU models, memory, or cloud instance types).
Software Dependencies | No | The paper does not list specific software dependencies with version numbers.
Experiment Setup | No | The paper states, 'All the detailed data processing, experiment setup, model configuration, parameter tuning, training procedure, validation, testing and sensitivity analysis are provided in the appendix.' While Algorithm 1 shows learning rates and discounts as parameters, specific numerical values for these or other hyperparameters are not provided in the main text.
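The update rule in Algorithm 1 (Pseudocode row) can be illustrated with a minimal Python sketch. This is not the authors' implementation, and everything below is an assumption made for illustration: the function name `minimax_optimize`, the default learning rates and discounts (the main text gives no numerical values), the toy objective, the use of exact gradients in place of the batch expectation, and the tolerance-plus-patience test standing in for "loss not stabilized". The two updates are applied one after the other, which is one possible reading of the pseudocode.

```python
from typing import Callable, Tuple

Fn = Callable[[float, float], float]


def minimax_optimize(
    loss: Fn,
    grad_theta: Fn,
    grad_psi: Fn,
    theta: float,
    psi: float,
    r_theta: float = 0.1,    # hypothetical learning rate for theta
    r_psi: float = 0.1,      # hypothetical learning rate for psi
    d_theta: float = 1.01,   # hypothetical discount, must be > 1
    d_psi: float = 1.01,     # hypothetical discount, must be > 1
    tol: float = 1e-10,      # assumed meaning of "loss stabilized"
    patience: int = 5,
    max_steps: int = 10_000,
) -> Tuple[float, float, int]:
    """Gradient descent on theta, gradient ascent on psi, with decaying rates."""
    if d_theta <= 1 or d_psi <= 1:
        raise ValueError("discounts must be greater than 1")

    prev = loss(theta, psi)
    calm = 0  # consecutive steps with a negligible change in the loss
    for step in range(1, max_steps + 1):
        # Descent step for the minimizing player (f_theta in the paper).
        theta = theta - r_theta * grad_theta(theta, psi)
        # Ascent step for the maximizing player (g_psi in the paper).
        psi = psi + r_psi * grad_psi(theta, psi)
        # Shrink both learning rates by their discounts.
        r_theta /= d_theta
        r_psi /= d_psi

        current = loss(theta, psi)
        calm = calm + 1 if abs(current - prev) < tol else 0
        prev = current
        if calm >= patience:
            return theta, psi, step
    return theta, psi, max_steps


# Toy saddle objective (an assumption, not from the paper):
# l(theta, psi) = 0.5*theta^2 + theta*psi - 0.5*psi^2, saddle point at (0, 0).
def toy_loss(theta: float, psi: float) -> float:
    return 0.5 * theta**2 + theta * psi - 0.5 * psi**2


def toy_grad_theta(theta: float, psi: float) -> float:
    return theta + psi


def toy_grad_psi(theta: float, psi: float) -> float:
    return theta - psi


if __name__ == "__main__":
    theta, psi, steps = minimax_optimize(
        toy_loss, toy_grad_theta, toy_grad_psi, theta=1.0, psi=1.0
    )
    print(f"theta={theta:.2e}, psi={psi:.2e}, steps={steps}")
```

In a real setting, the scalar parameters would be model weights and the gradients would be averaged over mini-batches. The discounts matter here because they shrink each later step, which is one way to damp the oscillation that plain descent-ascent tends to show around a saddle point.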