Partner-Aware Algorithms in Decentralized Cooperative Bandit Teams
Authors: Erdem Biyik, Anusha Lalitha, Rajarshi Saha, Andrea Goldsmith, Dorsa Sadigh (pp. 9296-9303)
AAAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We analytically show that our proposed strategy achieves logarithmic regret, and provide extensive experiments involving human-AI and human-robot collaboration to validate our theoretical findings. |
| Researcher Affiliation | Academia | 1 Department of Electrical Engineering, Stanford University 2 Department of Electrical and Computer Engineering, Princeton University 3 Department of Computer Science, Stanford University |
| Pseudocode | Yes | Algorithm 1: Partner-Aware UCB: Follower; Algorithm 2: Partner-Aware UCB: Leader |
| Open Source Code | Yes | Code at: https://sites.google.com/view/partner-aware-ucb |
| Open Datasets | No | The paper describes conducting simulations with fixed or random reward means and human-subject studies using an experimental setup (burger stacking, slot machines), but does not use a traditional publicly available or open dataset with access information. |
| Dataset Splits | No | The paper describes experimental runs and user studies, including warm-starting with simulated agents, but it does not specify traditional training/validation/test dataset splits needed for reproduction in a typical ML context. |
| Hardware Specification | No | The paper mentions using a 'Fetch robot (Wise et al. 2016)' for human-subject studies but does not provide specific hardware specifications (e.g., CPU, GPU, memory) of the computational resources used for simulations or training models, nor the robot's internal processing hardware. |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies, libraries, or programming languages used in the experiments or simulations. |
| Experiment Setup | Yes | Unless otherwise noted \|A1\| = \|A2\| = 2, p1 = 1, p2 = 0.5, c(L) = c(F) = 0.025 in these simulations. ... For Partner Aware UCB, we set L = 1, W = 25. ... We set, when relevant, L = 1, W = 2 and c(L) = c(F) = 0.01. ... we set c(i) = 0.025, L = 1, and W = 25 for all agents. |
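
For anyone attempting a reproduction, the values quoted in the Experiment Setup row can be recorded as a small configuration, as in the minimal sketch below. The class name `QuotedSetting`, its field names, and the list `QUOTED_SETTINGS` are hypothetical and do not come from the paper or its code. Each entry corresponds to one fragment of the quote as delimited by the ellipses. Which experiment each fragment belongs to is not stated in the excerpt, so fragments are kept separate rather than merged, and any value a fragment does not mention is left as `None`.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class QuotedSetting:
    """One fragment of the quoted setup (hypothetical container, not the authors' code).

    None means the fragment does not state that value.
    """
    num_actions: Optional[Tuple[int, int]] = None  # (|A1|, |A2|)
    p: Optional[Tuple[float, float]] = None        # (p1, p2)
    c: Optional[float] = None                      # c(L) = c(F), or c(i) for all agents
    L: Optional[int] = None                        # parameter L, as quoted
    W: Optional[int] = None                        # parameter W, as quoted


# Fragments in the order they appear in the quote; grouping by ellipsis is an assumption.
QUOTED_SETTINGS = [
    QuotedSetting(num_actions=(2, 2), p=(1.0, 0.5), c=0.025),  # "Unless otherwise noted ..."
    QuotedSetting(L=1, W=25),                                  # "For Partner Aware UCB ..."
    QuotedSetting(c=0.01, L=1, W=2),                           # "We set, when relevant ..."
    QuotedSetting(c=0.025, L=1, W=25),                         # "... for all agents"
]

if __name__ == "__main__":
    for i, setting in enumerate(QUOTED_SETTINGS, start=1):
        print(f"fragment {i}: {setting}")
```

Keeping unstated values as `None` makes it explicit which hyperparameters still have to be looked up in the paper before a run can be configured.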