Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Learning Fair Cooperation in Mixed-Motive Games with Indirect Reciprocity
Authors: Martin Smit, Fernando P. Santos
IJCAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We consider two modelling approaches: evolutionary game theory, where we comprehensively search for social norms (i.e., rules to assign reputations) leading to cooperation and fairness; and RL, where we consider how the stochastic dynamics of policy learning affects the analytically identified equilibria. We run our RL experiments with a population of 50 agents (45 in the majority group and 5 in the minority group). We fix the exploration rate µ and learning rate α to 0.1. Each simulation runs for 250,000 interactions and we run each simulation 50 times with a different seed. |
| Researcher Affiliation | Academia | Martin Smit, Fernando P. Santos. Informatics Institute, University of Amsterdam. EMAIL |
| Pseudocode | No | The paper provides mathematical equations for Q-value updates but does not include structured pseudocode or an algorithm block. |
| Open Source Code | Yes | The source code for this paper (models, experiments, and figures) is available on GitHub. Footnote 1: Appendix and code available at: www.github.com/sias-uva/ |
| Open Datasets | No | The paper describes a simulation environment with a 'well-mixed population of agents' and a 'donation game' rather than using a publicly available dataset with concrete access information. |
| Dataset Splits | No | The paper describes the population setup for its simulations but does not provide specific training, validation, or test dataset splits. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper does not mention any specific software dependencies with version numbers required to reproduce the experiment. |
| Experiment Setup | Yes | We fix the exploration rate µ and learning rate α to 0.1. Each simulation runs for 250,000 interactions and we run each simulation 50 times with a different seed. The rate of agent execution errors and judgement execution errors is relatively rare at 1%, and the benefit-to-cost ratio in our analytical model is 5 with c = 1, b = 5. Furthermore, the majority group comprises 90% of the population, and agents in different groups are functionally identical. |
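
The setup values quoted in the table can be collected into a single configuration object, which makes it easy to check that they agree with each other (for example, that 45 of 50 agents matches the stated 90% majority). The following is a minimal sketch, not the authors' code: the class name `ExperimentSetup`, its field names, and the choice of seeds `0..49` are assumptions made for illustration, since the paper excerpt states only that each run uses a different seed.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ExperimentSetup:
    """Hypothetical container for the hyperparameters quoted in the table."""

    population_size: int = 50            # total number of agents
    majority_size: int = 45              # agents in the majority group
    minority_size: int = 5               # agents in the minority group
    exploration_rate: float = 0.1        # µ in the quoted text
    learning_rate: float = 0.1           # α in the quoted text
    interactions_per_run: int = 250_000  # interactions per simulation
    runs: int = 50                       # repetitions, one seed each
    execution_error_rate: float = 0.01   # agent execution errors (1%)
    judgement_error_rate: float = 0.01   # judgement errors (1%)
    benefit: float = 5.0                 # b in the donation game
    cost: float = 1.0                    # c in the donation game

    def majority_fraction(self) -> float:
        """Share of the population that belongs to the majority group."""
        return self.majority_size / self.population_size

    def benefit_to_cost_ratio(self) -> float:
        """Ratio b / c used in the analytical model."""
        return self.benefit / self.cost

    def total_interactions(self) -> int:
        """Interactions summed over all repeated runs."""
        return self.interactions_per_run * self.runs

    def seeds(self) -> List[int]:
        """Assumed seeds 0..runs-1; the actual seeds are not given."""
        return list(range(self.runs))


setup = ExperimentSetup()
# Consistency checks against the figures quoted in the table.
assert setup.majority_size + setup.minority_size == setup.population_size
assert setup.majority_fraction() == 0.9
assert setup.benefit_to_cost_ratio() == 5.0
```

Freezing the dataclass keeps the reported values from being changed by accident between runs; a variant experiment would create a new instance with overridden fields instead.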