Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Position: Social Environment Design Should be Further Developed for AI-based Policy-Making
Authors: Edwin Zhang, Sadie Zhao, Tonghan Wang, Safwan Hossain, Henry Gasztowtt, Stephan Zheng, David C. Parkes, Milind Tambe, Yiling Chen
ICML 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We do not include experimental results here, as the primary purpose of this paper is to propose a future research agenda and illustrate open problems. |
| Researcher Affiliation | Collaboration | (1) Harvard University, (2) Founding, (3) Oxford University, (4) Asari AI, (5) Google Research. |
| Pseudocode | No | The paper does not contain any sections or figures explicitly labeled 'Pseudocode' or 'Algorithm'. |
| Open Source Code | Yes | We release a core implementation of our framework as a Sequential Social Dilemma Environment along with code; and We release our code in the supplementary material for reproducibility. |
| Open Datasets | No | The paper describes a custom 'Apple Picking Game' environment built with Melting Pot 2.0, but does not provide concrete access information or formal citation for a specific publicly available dataset used for training in the traditional sense. |
| Dataset Splits | No | The paper describes training details for an RL environment but does not specify dataset splits (e.g., train/validation/test percentages or counts) for reproducibility. |
| Hardware Specification | No | The paper mentions training details and hyperparameters but does not provide specific hardware details (e.g., CPU, GPU models, memory, or cloud instances) used for running experiments. |
| Software Dependencies | No | The paper mentions using PPO and GAE algorithms but does not provide specific software dependencies or library version numbers (e.g., Python, PyTorch, TensorFlow versions) for reproducibility. |
| Experiment Setup | Yes | Here we give a detailed breakdown of several key hyperparameters and Training Details within our environment in section 4. Table 1. Hyperparameters for our methods in section 4. Example parameters include: Number of Agents 7, Initial Number of Apples 64, Tax Period 50, Episode Length 1000, Sampling Horizon 200. |
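
For readers who want to carry the example hyperparameters quoted in the Experiment Setup row into their own experiments, the values could be collected in a small configuration object. This is only a sketch: the class name `ApplePickingConfig`, the field names, and the helper `tax_periods_per_episode` are hypothetical and do not come from the paper's released code. The helper also assumes that Tax Period and Episode Length are measured in the same unit (environment steps), which the summary above does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ApplePickingConfig:
    """Hypothetical container for the example hyperparameters listed above."""

    num_agents: int = 7            # Number of Agents
    initial_num_apples: int = 64   # Initial Number of Apples
    tax_period: int = 50           # Tax Period
    episode_length: int = 1000     # Episode Length
    sampling_horizon: int = 200    # Sampling Horizon

    def tax_periods_per_episode(self) -> int:
        # Assumption: tax_period and episode_length share the same unit
        # (environment steps), so integer division counts full tax periods.
        return self.episode_length // self.tax_period


config = ApplePickingConfig()
print(config)
print("Full tax periods per episode:", config.tax_periods_per_episode())
```

Freezing the dataclass keeps the recorded values from being changed by accident during a run; any remaining entries from the paper's Table 1 would need to be added from the paper itself.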