Adversarial Policies Beat Superhuman Go AIs
Authors: Tony Tong Wang, Adam Gleave, Tom Tseng, Kellin Pelrine, Nora Belrose, Joseph Miller, Michael D Dennis, Yawen Duan, Viktor Pogrebniak, Sergey Levine, Stuart Russell
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We attack the state-of-the-art Go-playing AI system KataGo by training adversarial policies against it, achieving a >97% win rate against KataGo running at superhuman settings. |
| Researcher Affiliation | Collaboration | ¹MIT, ²UC Berkeley, ³FAR AI, ⁴McGill University; Mila. |
| Pseudocode | No | The paper describes algorithms textually and with mathematical notation, but does not include any clearly labeled 'Pseudocode' or 'Algorithm' blocks, nor does it present structured steps formatted like code within a distinct block. |
| Open Source Code | Yes | Our open-source implementation is available at GitHub. |
| Open Datasets | Yes | These checkpoints can all be obtained from Wu (2022b). Wu, D. J. KataGo networks for kata1, 2022b. URL https://katagotraining.org/networks/. |
| Dataset Splits | No | The paper describes training adversarial policies against KataGo models and then evaluating their win rates. However, it does not state any dataset splits (e.g., training, validation, and test sets) for the data used to train the adversarial policies. |
| Hardware Specification | Yes | To train our adversaries, we used A4000, A6000, A100 40GB, and A100 80GB GPUs. |
| Software Dependencies | No | Specifically, using the TensorFlow version of KataGo (before KataGo switched to using PyTorch in KataGo version 1.12) we look at conv1 for the input layer... (The paper names the software but gives no version numbers for the libraries the authors used, such as TensorFlow or PyTorch.) |
| Experiment Setup | Yes | Table C.1. Key hyperparameter settings for our adversarial training runs. This table includes "Batch Size 256", "Learning Rate Scale of Hard-coded Schedule 1.0", "Adversary Visit Count 600". |