COLA: Consistent Learning with Opponent-Learning Awareness
Authors: Timon Willi, Alistair Hp Letcher, Johannes Treutlein, Jakob Foerster
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, in Sections 5 and 6, we report our experimental setup and results, investigating COLA and HOLA and comparing COLA to LOLA and CGD in a range of games. |
| Researcher Affiliation | Academia | 1Department of Engineering Science, University of Oxford, United Kingdom 2Department of Computer Science, University of Toronto, Canada 3Vector Institute, Toronto, Canada. |
| Pseudocode | No | The paper describes methods in text but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain an explicit statement or link providing access to open-source code for the described methodology. |
| Open Datasets | No | The paper runs its experiments on game environments defined by analytic loss functions rather than on publicly available datasets, so no dataset access information is provided. |
| Dataset Splits | No | The paper does not specify training, validation, or test splits; its experiments take place in game environments rather than on traditional datasets. |
| Hardware Specification | No | A part of this work was done while Timon Willi and Jakob Foerster were at the Vector Institute, University of Toronto. They are grateful for the access to the Vector Institute's compute infrastructure. They are also grateful for the access to the Advanced Research Computing (ARC) infrastructure. |
| Software Dependencies | No | All code was implemented using Python. The code relies on the PyTorch library for autodifferentiability (Paszke et al., 2019). The optimizer used is Adam (Kingma & Ba, 2015). Version numbers for these dependencies are not given. |
| Experiment Setup | Yes | For the polynomial games, COLA uses a neural network with 1 non-linear layer for both h1(θ1, θ2) and h2(θ1, θ2). The non-linearity is a ReLU function. The layer has 8 nodes. For training, we randomly sample pairs of parameters on a [-1, 1] parameter region. ... We use a batch size of 8. We found that training is improved with a learning rate scheduler. For the learning rate scheduling we use a γ of 0.9. We train the neural network for 120,000 steps. ... For the non-polynomial games, we deploy a neural network with 3 non-linear layers using Tanh activation functions. Each layer has 16 nodes. For this type of game, the parameter region is set to [-7, 7]... During training, we used a batch size of 64. (A minimal code sketch of this setup follows the table.) |
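
Since the paper does not release code, the following is a minimal PyTorch sketch of the experiment setup quoted in the last row, assuming a two-player game with one scalar parameter per player. The class name `ColaNet`, the input layout, the stand-in `consistency_loss`, and the scheduler stepping cadence are hypothetical; only the layer counts and widths, activations, parameter regions, batch sizes, the scheduler γ of 0.9, the Adam optimizer, and the 120,000 training steps come from the paper's description.

```python
import torch
import torch.nn as nn

class ColaNet(nn.Module):
    """Update network h_i(theta1, theta2) for one player (name is hypothetical)."""
    def __init__(self, hidden_sizes, activation):
        super().__init__()
        layers, in_dim = [], 2  # input: the parameter pair (theta1, theta2)
        for width in hidden_sizes:
            layers += [nn.Linear(in_dim, width), activation()]
            in_dim = width
        layers.append(nn.Linear(in_dim, 1))  # scalar update for this player
        self.net = nn.Sequential(*layers)

    def forward(self, theta):
        return self.net(theta)

def consistency_loss(theta, u1, u2):
    # Stand-in objective so the sketch runs end to end; the actual COLA
    # loss enforces the consistency equations defined in the paper.
    return (u1 ** 2 + u2 ** 2).mean()

# Polynomial games: 1 ReLU layer with 8 nodes, region [-1, 1], batch size 8.
# Non-polynomial games would instead use hidden_sizes=[16, 16, 16],
# activation=nn.Tanh, a [-7, 7] region, and batch size 64.
h1 = ColaNet(hidden_sizes=[8], activation=nn.ReLU)
h2 = ColaNet(hidden_sizes=[8], activation=nn.ReLU)
region, batch_size, num_steps = 1.0, 8, 120_000

opt = torch.optim.Adam(list(h1.parameters()) + list(h2.parameters()))
sched = torch.optim.lr_scheduler.ExponentialLR(opt, gamma=0.9)

for step in range(num_steps):
    # Randomly sample parameter pairs uniformly from the parameter region.
    theta = (torch.rand(batch_size, 2) * 2 - 1) * region
    loss = consistency_loss(theta, h1(theta), h2(theta))
    opt.zero_grad()
    loss.backward()
    opt.step()
    # The paper does not say how often the scheduler steps; decaying every
    # 10,000 iterations is an assumption made here for illustration.
    if step > 0 and step % 10_000 == 0:
        sched.step()
```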