Contextual Bandits with Cross-Learning

Authors: Santiago Balseiro, Negin Golrezaei, Mohammad Mahdian, Vahab Mirrokni, Jon Schneider

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We simulate our algorithms on real auction data from an ad exchange running first-price auctions (showing that they outperform traditional contextual bandit algorithms).
Researcher Affiliation | Collaboration | Santiago Balseiro, Columbia Business School, srb2155@columbia.edu; Negin Golrezaei, MIT Sloan School of Management, golrezae@mit.edu; Mohammad Mahdian, Google Research, mahdian@google.com; Vahab Mirrokni, Google Research, mirrokni@google.com; Jon Schneider, Google Research, jschnei@google.com
Pseudocode | Yes | Algorithm 1: O(√(KT log K)) regret algorithm (UCB1.CL) for the contextual bandits problem with cross-learning where rewards are stochastic and contexts are adversarial. ... Algorithm 2: O(√(KT log K)) regret algorithm (EXP3.CL) for the contextual bandits problem with simulated contexts.
Open Source Code | No | The paper does not provide any explicit statements about releasing source code or links to code repositories for the described methodology.
Open Datasets | No | We take existing first-price auction data from a large ad exchange... We collected anonymized data from 10 million consecutive auctions from this ad exchange... To remove outliers, bids and values above the 90% quantile were removed, and remaining bids/values were normalized to fit in the [0, 1] interval. The paper does not provide access to this data.
Dataset Splits | Yes | Parameters for each of these algorithms (including level of discretization of contexts for S-EXP3 and S-UCB1) were optimized via cross-validation on a separate data set of 10^5 auctions from the same ad exchange.
Hardware Specification | No | The paper does not specify any hardware used for the experiments.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers.
Experiment Setup | Yes | Allowable bids were discretized to multiples of 0.01. Parameters for each of these algorithms (including level of discretization of contexts for S-EXP3 and S-UCB1) were optimized via cross-validation on a separate data set of 10^5 auctions from the same ad exchange.
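
The preprocessing quoted under Open Datasets and Experiment Setup (dropping bids/values above the 90% quantile, rescaling to [0, 1], and restricting bids to multiples of 0.01) can be sketched roughly as below. This is an illustrative sketch, not the authors' code, which was not released. The function names, the nearest-rank quantile estimator, and the min-max rescaling are all assumptions. The last helper illustrates one plausible reading of "cross-learning" in a first-price auction: once the outcome of a bid is known, the reward that bid would have earned under any other value can be computed. The tie-breaking rule there is also an assumption.

```python
import math


def quantile(values, q):
    """Empirical q-quantile via the nearest-rank method.

    Assumption: the paper does not say which quantile estimator was used.
    """
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


def remove_outliers_and_normalize(values, q=0.9):
    """Drop entries above the q-quantile, then rescale the rest to [0, 1].

    Assumption: min-max rescaling; the paper only says the data were
    "normalized to fit in the [0, 1] interval".
    """
    cutoff = quantile(values, q)
    kept = [v for v in values if v <= cutoff]
    lo, hi = min(kept), max(kept)
    if hi == lo:
        return [0.0 for _ in kept]
    return [(v - lo) / (hi - lo) for v in kept]


def bid_grid(step=0.01):
    """Allowable bids: multiples of `step` in [0, 1]."""
    n = round(1 / step)
    return [i / n for i in range(n + 1)]


def cross_learned_rewards(bid, highest_competing_bid, values):
    """First-price utility of `bid` under every candidate value.

    Hypothetical helper: the bidder wins if its bid is at least the highest
    competing bid (ties go to the bidder, an assumption) and then earns
    value minus bid; otherwise the reward is zero for every value.
    """
    won = bid >= highest_competing_bid
    return [(v - bid) if won else 0.0 for v in values]


if __name__ == "__main__":
    raw_values = [3.0, 1.0, 4.0, 1.5, 9.0, 2.6, 5.0, 3.5, 8.0, 40.0]
    print(remove_outliers_and_normalize(raw_values))
    print(len(bid_grid()))
    print(cross_learned_rewards(0.5, 0.4, [0.25, 0.5, 1.0]))
```

In this sketch the outlier filter is applied to one list at a time. Whether the authors filtered bids and values jointly per auction record is not stated in the quoted text.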