Contextual Bandits with Cross-Learning
Authors: Santiago Balseiro, Negin Golrezaei, Mohammad Mahdian, Vahab Mirrokni, Jon Schneider
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We simulate our algorithms on real auction data from an ad exchange running first-price auctions (showing that they outperform traditional contextual bandit algorithms). |
| Researcher Affiliation | Collaboration | Santiago Balseiro, Columbia Business School, srb2155@columbia.edu; Negin Golrezaei, MIT Sloan School of Management, golrezae@mit.edu; Mohammad Mahdian, Google Research, mahdian@google.com; Vahab Mirrokni, Google Research, mirrokni@google.com; Jon Schneider, Google Research, jschnei@google.com |
| Pseudocode | Yes | Algorithm 1: O(√(KT log K)) regret algorithm (UCB1.CL) for the contextual bandits problem with cross-learning where rewards are stochastic and contexts are adversarial. ... Algorithm 2: O(√(KT log K)) regret algorithm (EXP3.CL) for the contextual bandits problem with simulated contexts. |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code or links to code repositories for the described methodology. |
| Open Datasets | No | We take existing first-price auction data from a large ad exchange... We collected anonymized data from 10 million consecutive auctions from this ad exchange... To remove outliers, bids and values above the 90% quantile were removed, and remaining bids/values were normalized to fit in the [0, 1] interval. The paper does not provide access to this data. |
| Dataset Splits | Yes | Parameters for each of these algorithms (including level of discretization of contexts for S-EXP3 and S-UCB1) were optimized via cross-validation on a separate data set of 10^5 auctions from the same ad exchange. |
| Hardware Specification | No | The paper does not specify any hardware used for the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | Allowable bids were discretized to multiples of 0.01. Parameters for each of these algorithms (including level of discretization of contexts for S-EXP3 and S-UCB1) were optimized via cross-validation on a separate data set of 10^5 auctions from the same ad exchange. |
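
The preprocessing quoted in the Open Datasets and Experiment Setup rows (dropping bids/values above the 90% quantile, normalizing to [0, 1], and restricting bids to multiples of 0.01) could look roughly like the sketch below. Since the paper releases neither code nor data, everything here is an assumption for illustration: the function names `preprocess` and `bid_grid` are hypothetical, min-max scaling is only one plausible reading of "normalized to fit in the [0, 1] interval", and the filter is applied to a single array rather than jointly to bid/value pairs.

```python
import numpy as np


def preprocess(samples, quantile=0.90):
    """Drop samples above the given quantile, then rescale the rest to [0, 1].

    Assumption: min-max scaling; the paper does not state the exact normalization.
    """
    samples = np.asarray(samples, dtype=float)
    cutoff = np.quantile(samples, quantile)
    kept = samples[samples <= cutoff]  # remove everything above the cutoff
    lo, hi = kept.min(), kept.max()
    if hi == lo:
        # Degenerate case: all remaining samples are identical.
        return np.zeros_like(kept)
    return (kept - lo) / (hi - lo)


def bid_grid(step=0.01):
    """Return the allowable bids: multiples of `step` in [0, 1]."""
    n = int(round(1.0 / step))
    # Build the grid by integer division to avoid accumulating float error.
    return np.arange(n + 1) / n


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    raw_values = rng.lognormal(mean=0.0, sigma=1.0, size=1000)  # synthetic stand-in data
    values = preprocess(raw_values)
    bids = bid_grid()
    print(len(values), values.min(), values.max(), len(bids))
```

The synthetic lognormal draw only stands in for the unavailable auction data. With a step of 0.01, the grid has 101 arms, which is the bid space a bandit algorithm in this setup would choose from.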