Strategic Linear Contextual Bandits

Authors: Thomas Kleine Buening, Aadirupa Saha, Christos Dimitrakakis, Haifeng Xu

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Finally, we support our theoretical findings with simulations of strategic gaming behavior in response to OptGTM and LinUCB (Section 6). ... Experimental Setup. We associate each arm with a true feature vector y_i ∈ R^{d_1} (e.g., product features) and randomly sample a sequence of user vectors c_t ∈ R^{d_2} (i.e., customer features)."
Researcher Affiliation | Collaboration | Thomas Kleine Buening (The Alan Turing Institute); Aadirupa Saha (Apple ML Research); Christos Dimitrakakis (University of Neuchâtel); Haifeng Xu (University of Chicago)
Pseudocode | Yes | "Mechanism 1: The Greedy Grim Trigger Mechanism (GGTM) ... Mechanism 2: The Optimistic Grim Trigger Mechanism (OptGTM)"
Open Source Code | No | The paper's checklist item 5 ("Open access to data and code: Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material?") is answered "[NA]" with no justification given.
Open Datasets | No | "We associate each arm with a true feature vector y_i ∈ R^{d_1} (e.g., product features) and randomly sample a sequence of user vectors c_t ∈ R^{d_2} (i.e., customer features)." The paper describes a simulation setup rather than referencing a publicly available dataset or providing access information for one.
Dataset Splits | No | The paper describes a simulated environment in which strategic arms interact with the deployed algorithm over 20 epochs of T = 10k rounds each. It does not specify training, validation, or test dataset splits.
Hardware Specification | No | The paper does not provide hardware details such as GPU/CPU models, memory, or other machine specifications used to run its experiments.
Software Dependencies | No | The paper does not specify software dependencies with version numbers (e.g., "Python 3.8, PyTorch 1.9") needed to replicate the experiments.
Experiment Setup | Yes | "Experimental Setup. We associate each arm with a true feature vector y_i ∈ R^{d_1} (e.g., product features) and randomly sample a sequence of user vectors c_t ∈ R^{d_2} (i.e., customer features). We use a feature mapping φ(c_t, y_i) = x_{t,i} to map y_i ∈ R^{d_1} and c_t ∈ R^{d_2} to an arm-specific context x_{t,i} ∈ R^d that the algorithm observes. At the end of every epoch, each arm then performs an approximated gradient step on y_i w.r.t. its utility, i.e., the number of times it is selected. We let K = 5 and d = d_1 = d_2 = 5 and average the results over 10 runs."
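The quoted experimental setup is concrete enough to sketch in code. The following is a hypothetical re-implementation, not the authors' code: the feature mapping φ (elementwise product), the learner (a plain greedy linear policy standing in for OptGTM/LinUCB), the finite-difference utility gradient, and all step sizes and scales are illustrative assumptions; only K, d_1, d_2, d, and the epoch structure come from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

K, d1, d2, d = 5, 5, 5, 5  # 5 arms and feature dimensions, as in the paper
T = 500                    # rounds per epoch (paper uses T = 10k)
epochs = 2                 # paper runs 20 epochs, averaged over 10 runs

def phi(c, y):
    """Assumed feature mapping phi(c_t, y_i) -> x_{t,i}; elementwise product here."""
    return c * y

# True arm features y_i and a hidden linear reward parameter (assumption).
Y = rng.normal(size=(K, d1))
theta = rng.normal(size=d)

def epoch_selections(Y_reported):
    """Run one epoch with a greedy linear learner; return per-arm selection counts."""
    counts = np.zeros(K)
    for _ in range(T):
        c = rng.normal(size=d2)                        # user vector c_t
        X = np.array([phi(c, y) for y in Y_reported])  # contexts x_{t,i} seen by learner
        counts[np.argmax(X @ theta)] += 1.0            # greedy arm choice
    return counts

# "Approximated gradient step on y_i w.r.t. its utility": each arm nudges its
# reported features to increase its own selection count, with the gradient
# estimated by one-sided finite differences (an assumed estimator).
eps, lr = 0.1, 0.01
Y_rep = Y.copy()
for _ in range(epochs):
    base = epoch_selections(Y_rep)
    for i in range(K):
        grad = np.zeros(d1)
        for j in range(d1):
            Y_pert = Y_rep.copy()
            Y_pert[i, j] += eps
            grad[j] = (epoch_selections(Y_pert)[i] - base[i]) / eps
        Y_rep[i] += lr * grad  # strategic update toward more selections
```

In this sketch the strategic drift of `Y_rep` away from the true features `Y` is the gaming behavior the paper simulates; swapping the greedy learner for an optimistic or grim-trigger mechanism would change how profitable that drift is.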