Consequentialist conditional cooperation in social dilemmas with imperfect information

Authors: Alexander Peysakhovich, Adam Lerer

ICLR 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We show how to construct such strategies using deep reinforcement learning techniques and demonstrate, both analytically and experimentally, that they are effective in social dilemmas beyond simple matrix games."
Researcher Affiliation | Industry | Alexander Peysakhovich & Adam Lerer, Facebook AI Research, New York, NY, {alexpeys,alerer}@fb.com
Pseudocode | Yes | Algorithm 1: CCC as Agent 1 (a hedged sketch of the switching rule follows the table)
Open Source Code | No | The paper mentions using the pytorch-a3c package (https://github.com/ikostrikov/pytorch-a3c) but does not state that the authors' own implementation of the described method is open source or otherwise provided.
Open Datasets | No | The paper describes custom game environments, 'Fishery' and the 'Pong Player's Dilemma (PPD)', plus a version of 'Coins' identical to Lerer & Peysakhovich (2017) but with the board expanded to 8×8, yet provides no links, repositories, or explicit public-availability statements for these environments.
Dataset Splits | No | The paper describes the agents' training procedures and mentions using 50 pairs of agents, but gives no train/validation/test splits or percentages for any data used in the experiments.
Hardware Specification | No | The paper mentions using '38 threads for A3C' but names no specific CPU model, GPU model, or other hardware components used to run the experiments.
Software Dependencies | No | The paper mentions the pytorch-a3c package but provides no version number for it or for any other software dependency.
Experiment Setup | Yes | "We use the default settings from pytorch-a3c: a discount rate of 0.99, learning rate of 0.0001, 20-step returns, and entropy regularization weight of 0.01." (A configuration sketch follows the table.)
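For readers who want the shape of Algorithm 1 ("CCC as Agent 1"), below is a minimal sketch of the consequentialist switching rule: the agent plays a cooperative policy while its realized payoff keeps pace with a cooperative baseline, and switches to a selfish policy otherwise. The names `pi_c`, `pi_d`, `expected_coop_reward`, and the threshold `T` are illustrative assumptions written against a Gym-style environment API, not the authors' code.

```python
# A hedged sketch of the CCC decision rule, assuming two pre-trained
# policies: pi_c (trained on shared rewards, i.e. cooperative) and
# pi_d (trained selfishly, i.e. defecting). All identifiers here are
# illustrative, not taken from the paper's implementation.

def ccc_episode(env, pi_c, pi_d, expected_coop_reward, T):
    """Play one episode as agent 1, conditioning on outcomes:
    cooperate while the realized payoff tracks the cooperative
    baseline, and defect once it falls behind by more than T."""
    obs = env.reset()
    total_reward, t, done = 0.0, 0, False
    while not done:
        # Consequentialist test: compare payoffs, not opponent actions.
        if total_reward >= expected_coop_reward(t) - T:
            action = pi_c(obs)  # keep playing the cooperative policy
        else:
            action = pi_d(obs)  # punish by playing the selfish policy
        obs, reward, done, _ = env.step(action)
        total_reward += reward
        t += 1
    return total_reward
```

Conditioning the switch on cumulative reward rather than on observed opponent actions is what makes the strategy workable under imperfect information: the agent never needs to see what its partner did, only what it earned.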
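The Experiment Setup row quotes the hyperparameters verbatim. For concreteness, the sketch below maps them onto the command-line flags of the ikostrikov/pytorch-a3c package the paper cites; the flag names are an assumption based on that repository's defaults, not the authors' exact invocation.

```python
# Hedged mapping of the reported settings onto pytorch-a3c flags
# (flag names assumed from ikostrikov/pytorch-a3c; values are the
# defaults the paper says it used, plus the reported thread count).
A3C_ARGS = {
    "--gamma": 0.99,         # discount rate
    "--lr": 0.0001,          # learning rate
    "--num-steps": 20,       # n-step returns
    "--entropy-coef": 0.01,  # entropy regularization weight
    "--num-processes": 38,   # '38 threads for A3C', as reported
}
```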