Distributional Off-Policy Evaluation for Slate Recommendations

Authors: Shreyas Chaudhari, David Arbour, Georgios Theocharous, Nikos Vlassis

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We validate the efficacy of our method empirically on synthetic data as well as on a slate recommendation simulator constructed from real-world data (MovieLens-20M). Our results show a significant reduction in estimation variance and improved sample efficiency over prior work across a range of slate structures.
Researcher Affiliation | Collaboration | Shreyas Chaudhari (1), David Arbour (2), Georgios Theocharous (2), Nikos Vlassis (2). (1) University of Massachusetts Amherst; (2) Adobe Research. schaudhari@cs.umass.edu, {arbour,theochar,vlassis}@adobe.com
Pseudocode | Yes | Algorithm 1: SUnO( )
Open Source Code | Yes | The code is available at: https://github.com/shreyasc-13/suno.
Open Datasets | Yes | We test our estimator on a publicly available dataset MovieLens-20M (Harper and Konstan 2015) and on a semi-synthetic slate simulator Open Bandit Pipeline (Saito et al. 2020).
Dataset Splits | No | The paper uses an "offline dataset" for evaluation and discusses "different logged data sizes" and averaging over trials, but it does not specify explicit train/validation/test splits with percentages or sample counts for data partitioning.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | No | The paper does not list specific software dependencies with version numbers.
Experiment Setup | Yes | For these experiments, we set the number of slots K = 3 and the number of actions in each slot to N = 3. ... Here N = 20, K = 5, = 0.01 and results are averaged over 50 trials. ... We set K = 3, N = 10, and the results are averaged over 10 trials.
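
To give a sense of scale for the three (K, N) settings quoted in the Experiment Setup row, the following minimal Python sketch computes how many distinct slates each setting allows. It assumes every one of the K slots independently picks one of N actions, so the slate space has N^K elements. The class name `SlateSetup` and the labels `setup_1` to `setup_3` are hypothetical and do not come from the paper or its code. The trial count is left as `None` where the quoted text does not state it, and the unnamed parameter set to 0.01 is omitted because its symbol is missing from the excerpt.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SlateSetup:
    """One (K, N) setting quoted in the Experiment Setup row."""
    label: str                  # hypothetical label, not taken from the paper
    num_slots: int              # K, the number of slots in a slate
    actions_per_slot: int       # N, the number of actions available per slot
    num_trials: Optional[int]   # None where the excerpt gives no trial count

    def num_slates(self) -> int:
        # Assumption: each slot chooses among N actions independently,
        # so the number of distinct slates is N ** K.
        return self.actions_per_slot ** self.num_slots


SETUPS = [
    SlateSetup("setup_1", num_slots=3, actions_per_slot=3, num_trials=None),
    SlateSetup("setup_2", num_slots=5, actions_per_slot=20, num_trials=50),
    SlateSetup("setup_3", num_slots=3, actions_per_slot=10, num_trials=10),
]

for setup in SETUPS:
    print(f"{setup.label}: K={setup.num_slots}, N={setup.actions_per_slot}, "
          f"slates={setup.num_slates()}, trials={setup.num_trials}")
```

Under this assumption the slate space grows from 27 slates (K = 3, N = 3) to 3,200,000 slates (K = 5, N = 20), which suggests why estimation variance and sample efficiency are the quantities the evaluation reports.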