reproducibilityindex.ai

Adversarial Policies Beat Superhuman Go AIs

Authors: Tony Tong Wang, Adam Gleave, Tom Tseng, Kellin Pelrine, Nora Belrose, Joseph Miller, Michael D Dennis, Yawen Duan, Viktor Pogrebniak, Sergey Levine, Stuart Russell

ICML 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We attack the state-of-the-art Go-playing AI system KataGo by training adversarial policies against it, achieving a >97% win rate against KataGo running at superhuman settings.
Researcher Affiliation	Collaboration	1 MIT, 2 UC Berkeley, 3 FAR AI, 4 McGill University; Mila.
Pseudocode	No	The paper describes algorithms textually and with mathematical notation, but does not include any clearly labeled 'Pseudocode' or 'Algorithm' blocks, nor does it present structured steps formatted like code within a distinct block.
Open Source Code	Yes	Our open-source implementation is available at GitHub.
Open Datasets	Yes	These checkpoints can all be obtained from Wu (2022b). Wu, D. J. KataGo networks for kata1, 2022b. URL https://katagotraining.org/networks/.
Dataset Splits	No	The paper describes training adversarial policies against KataGo models and then evaluating their win rates. However, it does not explicitly state dataset splits (e.g., training, validation, test sets) for the data used to train the adversarial policies themselves.
Hardware Specification	Yes	To train our adversaries, we used A4000, A6000, A100 40GB, and A100 80GB GPUs.
Software Dependencies	No	Specifically, using the TensorFlow version of KataGo (before KataGo switched to using PyTorch in KataGo version 1.12) we look at conv1 for the input layer... (This mentions software but lacks specific version numbers for the libraries like TensorFlow or PyTorch used by the authors).
Experiment Setup	Yes	Table C.1. Key hyperparameter settings for our adversarial training runs. This table includes "Batch Size 256", "Learning Rate Scale of Hard-coded Schedule 1.0", "Adversary Visit Count 600".
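
For readers who want to carry the three settings quoted in the Experiment Setup row into their own scripts, a minimal sketch follows. The class name and field names are hypothetical and chosen only for illustration; they are not taken from the authors' code or from KataGo's configuration format. Only the three values themselves come from the row above.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class AdversaryTrainingConfig:
    """Hypothetical container for the settings quoted from Table C.1.

    Field names are illustrative assumptions, not the authors' identifiers.
    """

    batch_size: int = 256                # "Batch Size 256"
    learning_rate_scale: float = 1.0     # "Learning Rate Scale of Hard-coded Schedule 1.0"
    adversary_visit_count: int = 600     # "Adversary Visit Count 600"


config = AdversaryTrainingConfig()
settings = asdict(config)  # plain dict, convenient for logging or serialization

for name, value in settings.items():
    print(f"{name} = {value}")
```

The dataclass is frozen so that the recorded settings cannot be changed by accident during a run; the remaining hyperparameters in Table C.1 are not listed on this page and are left out here.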