Learning to Play Sequential Games versus Unknown Opponents

Authors: Pier Giuseppe Sessa, Ilija Bogunovic, Maryam Kamgarpour, Andreas Krause

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Finally, we experimentally validate the performance of our algorithms in traffic routing and wildlife conservation tasks, where they consistently outperform other baselines. In this section, we evaluate the proposed algorithms in traffic routing and wildlife conservation tasks.
Researcher Affiliation | Academia | Pier Giuseppe Sessa, ETH Zürich, sessap@ethz.ch; Ilija Bogunovic, ETH Zürich, ilijab@ethz.ch; Maryam Kamgarpour, ETH Zürich, maryamk@ethz.ch; Andreas Krause, ETH Zürich, krausea@ethz.ch
Pseudocode | Yes | Algorithm 1 The STACKELUCB algorithm (Playing vs. Sequence of Unknown Opponents)
Open Source Code | No | The paper does not provide an explicit statement or link to the open-source code for the described methodology.
Open Datasets | Yes | We use the road traffic network of Sioux-Falls [21], which can be represented as a directed graph with 24 nodes and 76 edges e ∈ E. Network's data and congestion model are based on [21].
Dataset Splits | No | The paper uses specific datasets (Sioux-Falls network, wildlife conservation game model) but does not provide explicit details about training, validation, or test data splits (e.g., percentages or sample counts).
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts) used for running its experiments.
Software Dependencies | No | The paper does not provide specific ancillary software details, such as library or solver names with version numbers, needed to replicate the experiment.
Experiment Setup | Yes | We run STACKELUCB with polynomial kernels of degree 3 or 4 (polynomial functions are typically used as good congestion models, cf., [21]), set according to Theorem 1 and use β_t = 0.5 (we also observed, as in [35], that theory-informed values for β_t are overly conservative). Kernel hyperparameters are computed offline via maximum-likelihood over 100 randomly generated points.
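
The experiment setup above can be illustrated with a minimal sketch. It is not the authors' implementation (no code was released). It assumes a Gaussian-process model with a degree-3 polynomial kernel, selects a single kernel hyperparameter (a hypothetical `offset`) by maximizing the log marginal likelihood over 100 random points, and forms an upper confidence bound with β_t = 0.5. The toy target function, the noise level, and the small offset grid (standing in for a proper likelihood optimizer) are all assumptions made for illustration.

```python
import math
import random

def poly_kernel(x, y, degree=3, offset=1.0):
    """Polynomial kernel k(x, y) = (x . y + offset) ** degree."""
    return (sum(a * b for a, b in zip(x, y)) + offset) ** degree

def cholesky(A):
    """Lower-triangular L with L L^T = A (A must be positive definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_lower(L, b):
    """Forward substitution: solve L z = b."""
    z = []
    for i in range(len(b)):
        z.append((b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i])
    return z

def solve_upper_t(L, z):
    """Back substitution: solve L^T a = z."""
    n = len(z)
    a = [0.0] * n
    for i in reversed(range(n)):
        a[i] = (z[i] - sum(L[k][i] * a[k] for k in range(i + 1, n))) / L[i][i]
    return a

def fit(X, y, degree, offset, noise=0.1):
    """Return (L, alpha, log marginal likelihood) for the given hyperparameters."""
    n = len(X)
    K = [[poly_kernel(X[i], X[j], degree, offset) + (noise ** 2 if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    L = cholesky(K)
    alpha = solve_upper_t(L, solve_lower(L, y))
    lml = (-0.5 * sum(yi * ai for yi, ai in zip(y, alpha))
           - sum(math.log(L[i][i]) for i in range(n))
           - 0.5 * n * math.log(2.0 * math.pi))
    return L, alpha, lml

def ucb(x, X, L, alpha, degree, offset, beta=0.5):
    """Posterior mean plus beta times posterior standard deviation at x."""
    k = [poly_kernel(x, xi, degree, offset) for xi in X]
    mean = sum(ki * ai for ki, ai in zip(k, alpha))
    v = solve_lower(L, k)
    var = max(poly_kernel(x, x, degree, offset) - sum(vi * vi for vi in v), 0.0)
    return mean + beta * math.sqrt(var)

# Assumed toy data: 100 random points and a noisy cubic "congestion-like" target.
random.seed(0)
X = [[random.random()] for _ in range(100)]
y = [1.0 + 2.0 * x[0] ** 3 + random.gauss(0.0, 0.05) for x in X]

# Offline maximum-likelihood selection of the offset over a small assumed grid.
DEGREE = 3
(L, alpha, lml), best_offset = max(
    ((fit(X, y, DEGREE, c), c) for c in (0.5, 1.0, 2.0)),
    key=lambda item: item[0][2],
)

mean_at_half = ucb([0.5], X, L, alpha, DEGREE, best_offset, beta=0.0)
ucb_at_half = ucb([0.5], X, L, alpha, DEGREE, best_offset, beta=0.5)
print(best_offset, mean_at_half, ucb_at_half)
```

Keeping β_t a small constant such as 0.5 narrows the confidence band relative to theory-informed values, which the quoted setup describes as overly conservative. The sketch exposes it as the `beta` argument so the two choices can be compared.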