Multi-Armed Bandits with Network Interference
Authors: Abhineet Agarwal, Anish Agarwal, Lorenzo Masoero, Justin Whitehouse
NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, we corroborate our theoretical findings via numerical simulations. [...] 6 Simulations |
| Researcher Affiliation | Collaboration | Abhineet Agarwal Department of Statistics UC Berkeley aa3797@berkeley.edu Anish Agarwal Department of IEOR Columbia University aa5194@columbia.edu Lorenzo Masoero Amazon masoerl@amazon.com Justin Whitehouse Computer Science Department Carnegie Mellon University jwhiteho@andrew.cmu.edu |
| Pseudocode | Yes | Algorithm 1 Network Explore-Then-Commit with Known Interference [...] Algorithm 2 Network Explore-Then-Commit with Unknown Interference |
| Open Source Code | Yes | Code for our methods and experiments can be found at https://github.com/aagarwal1996/Network MAB. |
| Open Datasets | No | The paper describes a 'Data Generating Process' for simulations but does not provide public access to a generated dataset; the data are generated dynamically for the experiments. |
| Dataset Splits | Yes | For our Algorithms, we choose all hyper-parameters via 3-fold CV |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments (e.g., CPU/GPU models, memory). |
| Software Dependencies | No | The paper mentions using 'the scikit-learn implementation of the Lasso' but does not specify its version number or other software dependencies with versions. |
| Experiment Setup | Yes | Data Generating Process. We generate interference patterns with a varying number of units N ∈ {5, . . . , 10}, and A = 2. For each N, we use s = 4. We generate rewards r_n = ⟨θ_n, χ(a)⟩, where the non-zero elements of θ_n (i.e., θ_{n,S} for S ∈ B_n) are drawn uniformly from [0, 1]. We normalize rewards so that they are contained in [0, 1], and add 1-sub-Gaussian noise to sampled rewards. [...] For our Algorithms, we choose all hyper-parameters via 3-fold CV, and use the scikit-learn implementation of the Lasso. [...] Algorithm 2 run with λ = 4√(E⁻¹ log(2AN/δ)), where E := (T A^s)^{2/3} (log(N/δ) + N log(A))^{1/3} |
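
To make the quoted experiment setup concrete, below is a minimal Python sketch of an explore-then-commit loop on a simulated network-interference environment. It is an illustration under stated assumptions, not the authors' implementation: the random neighborhood construction, the per-unit one-hot encoding of neighborhood joint actions, the exploration length, and the brute-force commit step are hypothetical simplifications. Only the broad shape (A = 2, s = 4, Uniform[0, 1] reward coefficients, rewards in [0, 1], 1-sub-Gaussian noise, per-unit Lasso with 3-fold CV via scikit-learn, then explore-then-commit) follows the quoted description.

```python
import itertools
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(0)

# --- Data-generating process (sketch of the quoted setup) ---
# N units, A = 2 actions per unit; each unit's mean reward depends only on the
# joint action of its size-s interference neighborhood. Coefficients are
# Uniform[0, 1], so mean rewards already lie in [0, 1]; noise is standard
# normal, i.e. 1-sub-Gaussian. Neighborhoods here are chosen at random purely
# for illustration.
N, A, s = 6, 2, 4

# Each unit's neighborhood: itself plus s - 1 other randomly chosen units.
neighborhoods = [
    np.r_[n, rng.choice(np.setdiff1d(np.arange(N), n), size=s - 1, replace=False)]
    for n in range(N)
]

# One Uniform[0, 1] coefficient per joint neighborhood action profile.
theta = [rng.uniform(size=A ** s) for _ in range(N)]

def neighborhood_profile(actions, nbhd):
    """Encode the joint action of a neighborhood as an integer in [0, A**s)."""
    idx = 0
    for unit in nbhd:
        idx = A * idx + int(actions[unit])
    return idx

def one_hot_features(actions, nbhd):
    """One-hot encoding of the neighborhood's joint action (dimension A**s)."""
    x = np.zeros(A ** s)
    x[neighborhood_profile(actions, nbhd)] = 1.0
    return x

def sample_rewards(actions):
    """Per-unit mean reward in [0, 1] plus 1-sub-Gaussian (standard normal) noise."""
    means = np.array(
        [theta[n][neighborhood_profile(actions, neighborhoods[n])] for n in range(N)]
    )
    return means + rng.normal(size=N)

# --- Explore-then-commit sketch (in the spirit of Algorithms 1 and 2) ---
explore_len = 500

# Explore: play uniformly random action profiles and record per-unit data.
X = {n: [] for n in range(N)}
Y = {n: [] for n in range(N)}
for _ in range(explore_len):
    a = rng.integers(0, A, size=N)
    r = sample_rewards(a)
    for n in range(N):
        X[n].append(one_hot_features(a, neighborhoods[n]))
        Y[n].append(r[n])

# Estimate: one Lasso per unit, hyper-parameters chosen by 3-fold CV
# via scikit-learn, matching the quoted setup.
models = [LassoCV(cv=3).fit(np.array(X[n]), np.array(Y[n])) for n in range(N)]

# Commit: brute-force the action profile maximizing the estimated total reward
# (feasible here only because A**N is small) and play it for the remaining rounds.
best_profile, best_value = None, -np.inf
for a in itertools.product(range(A), repeat=N):
    a = np.array(a)
    value = sum(
        models[n].predict(one_hot_features(a, neighborhoods[n])[None, :])[0]
        for n in range(N)
    )
    if value > best_value:
        best_profile, best_value = a, value

print("Committed action profile:", best_profile)
```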