Fast Offline Policy Optimization for Large Scale Recommendation
Authors: Otmane Sakhi, David Rohde, Alexandre Gilotte
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments show that our algorithm is an order of magnitude faster than naive approaches yet produces equally good policies. |
| Researcher Affiliation | Collaboration | Criteo AI Lab, Paris, France; CREST-ENSAE, IPP, Palaiseau, France. {o.sakhi, d.rohde, a.gilotte}@criteo.com |
| Pseudocode | Yes | Algorithm 1: Fast Offline Policy Learning |
| Open Source Code | Yes | The source code to reproduce the results has all implementation details: https://github.com/criteo-research/fopo |
| Open Datasets | Yes | We chose two collaborative filtering datasets with large catalog sizes to validate our approach: the Twitch dataset (Rappaz, McAuley, and Aberer 2021) and the Goodreads user-books interaction dataset (Wan and McAuley 2018; Wan et al. 2019) |
| Dataset Splits | No | The paper mentions splitting the dataset into a train and test split, but does not explicitly detail a separate validation split or its proportions. |
| Hardware Specification | No | The paper mentions experiments were run on "CPU and GPU devices" and discusses speedups on "CPU machines" and with "GPU", but it does not specify any particular models (e.g., NVIDIA A100, Intel Xeon). |
| Software Dependencies | No | The paper mentions using "Pytorch", "Adam optimizer", "HNSW algorithm", and "FAISS library" but does not specify version numbers for any of these software components. |
| Experiment Setup | Yes | We opt for the Adam optimizer (Kingma and Ba 2014) with a batch size of 32 and a learning rate of 10^-4 for all the experiments with the Twitch dataset and 5·10^-5 for the Goodreads dataset. ... We plot the results of these runs on Figure 2. We observe that, even if REINFORCE has a much bigger time complexity per iteration (scales linearly on the catalog size), it does not outperform the optimization routines suggested by our approach. ... We fix K = 256 and S = 1000. ... embedding dimension set to L = 1000 ... The algorithms were run on the datasets considered for 50 epochs |
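The quoted setup highlights the paper's central complexity claim: a REINFORCE update must score every item in the catalog, while the proposed routine touches only K = 256 sampled items per step. A minimal back-of-the-envelope sketch of that per-iteration gap (the catalog size below is hypothetical, not a figure from the paper; K = 256 and batch size 32 are from the quoted setup):

```python
# Rough per-iteration cost comparison: a full-softmax REINFORCE step
# evaluates a score for every catalog item, whereas a sampled update
# evaluates scores only for K sampled items.

def per_iteration_cost(items_touched: int, batch_size: int = 32) -> int:
    """Score evaluations per optimization step (constants ignored)."""
    return batch_size * items_touched

P = 1_000_000  # hypothetical catalog size (not from the paper)
K = 256        # sampled items per update, as reported in the setup

full_cost = per_iteration_cost(P)       # REINFORCE: linear in catalog size
sampled_cost = per_iteration_cost(K)    # sampled update: independent of P

speedup = full_cost / sampled_cost
print(f"approximate per-iteration speedup: {speedup:.0f}x")
```

For a million-item catalog this ratio is P / K ≈ 3900, consistent with the order-of-magnitude speedups the abstract claims, though the paper's measured wall-clock gains also depend on the HNSW/FAISS sampling machinery and hardware.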