Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
COLA: Consistent Learning with Opponent-Learning Awareness
Authors: Timon Willi, Alistair Hp Letcher, Johannes Treutlein, Jakob Foerster
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, in Sections 5 and 6, we report our experimental setup and results, investigating COLA and HOLA and comparing COLA to LOLA and CGD in a range of games. |
| Researcher Affiliation | Academia | 1Department of Engineering Science, University of Oxford, United Kingdom 2Department of Computer Science, University of Toronto, Canada 3Vector Institute, Toronto, Canada. |
| Pseudocode | No | The paper describes methods in text but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain an explicit statement or link providing access to open-source code for the described methodology. |
| Open Datasets | No | The paper performs experiments on game environments described by loss functions, rather than using traditional publicly available datasets with specified access information. |
| Dataset Splits | No | The paper does not specify training, validation, or test dataset splits in the context of traditional datasets, as it conducts experiments in game environments. |
| Hardware Specification | No | A part of this work was done while Timon Willi and Jakob Foerster were at the Vector Institute, University of Toronto. They are grateful for the access to the Vector Institute's compute infrastructure. They are also grateful for the access to the Advanced Research Computing (ARC) infrastructure. |
| Software Dependencies | No | All code was implemented using Python. The code relies on the PyTorch library for autodifferentiability (Paszke et al., 2019). The optimizer used is Adam (Kingma & Ba, 2015). |
| Experiment Setup | Yes | For the polynomial games, COLA uses a neural network with 1 non-linear layer for both h1(θ1, θ2) and h2(θ1, θ2). The non-linearity is a ReLU function. The layer has 8 nodes. For training, we randomly sample pairs of parameters on a [-1, 1] parameter region. ... We use a batch size of 8. We found that training is improved with a learning rate scheduler. For the learning rate scheduling we use a γ of 0.9. We train the neural network for 120,000 steps. ... For the non-polynomial games, we deploy a neural network with 3 non-linear layers using Tanh activation functions. Each layer has 16 nodes. For this type of game, the parameter region is set to [-7, 7]... During training, we used a batch size of 64. |
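
The two configurations quoted in the Experiment Setup row can be written down as a small sketch. This is not the authors' code: the `ColaSetup` class, its field names, and `sample_batch` are hypothetical. It also assumes that each player's parameter is a single scalar and that pairs are drawn uniformly from the square region, since the quoted text only says they are sampled randomly. The learning rate scheduler and step count are left out because the quote does not name the scheduler type and gives the step count only for the polynomial games.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ColaSetup:
    """Hyperparameters quoted in the Experiment Setup row (hypothetical container)."""
    hidden_layers: int   # number of non-linear layers
    hidden_units: int    # nodes per layer
    activation: str      # "relu" or "tanh"
    param_bound: float   # parameters are sampled from [-bound, bound]
    batch_size: int


# Values come from the quoted text; the names are illustrative only.
POLYNOMIAL = ColaSetup(hidden_layers=1, hidden_units=8, activation="relu",
                       param_bound=1.0, batch_size=8)
NON_POLYNOMIAL = ColaSetup(hidden_layers=3, hidden_units=16, activation="tanh",
                           param_bound=7.0, batch_size=64)


def sample_batch(setup, rng):
    """Draw one batch of (theta1, theta2) pairs.

    Assumption: scalar parameters, sampled uniformly over the region.
    """
    b = setup.param_bound
    return [(rng.uniform(-b, b), rng.uniform(-b, b))
            for _ in range(setup.batch_size)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
poly_batch = sample_batch(POLYNOMIAL, rng)
nonpoly_batch = sample_batch(NON_POLYNOMIAL, rng)
print(len(poly_batch), len(nonpoly_batch))  # prints: 8 64
```

Keeping both setups in one frozen dataclass makes the differences between the polynomial and non-polynomial games easy to compare at a glance. A real reimplementation would feed these batches to a PyTorch network of the stated depth and width.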