Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Modeling the Economic Impacts of AI Openness Regulation

Authors: Tori Qiu, Benjamin Laufer, Jon Kleinberg, Hoda Heidari

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our model provides a theoretical foundation for AI governance decisions around openness and enables evaluation and refinement of practical open-source policies. Overall, we find the model's baseline performance determines when increasing the regulatory penalty vs. the open-source threshold will significantly alter the generalist's release strategy. G's no-regulation equilibrium strategies corroborate empirical patterns of model release, where models with lower performance are relatively more open access and closed models outpace open-weight models by several months in terms of performance (Figure 3).
Researcher Affiliation	Academia	Tori Qiu Carnegie Mellon University EMAIL Benjamin Laufer Cornell Tech EMAIL Jon Kleinberg Cornell University EMAIL Hoda Heidari Carnegie Mellon University EMAIL
Pseudocode	No	The paper describes the game setup and stages in textual format, for example, 'The full game with G's openness decision has the following stages: 1. G and D bargain over δ(ω, α1)... 2. G chooses an openness level ω ∈ [0, 1]... 3. Assuming G does not abstain, D chooses to adopt G's model and invests in improving it to performance α1 ≥ α0.', but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code	Yes	Answer: [Yes] Justification: Code for reproducing plots.
Open Datasets	Yes	Appendix F reports the figure data. ELO Score is the model's arena score from the Chatbot Arena LLM Leaderboard. % Closed is calculated the same as Figure 4b from Eiras et al. [14], using Table 3's categorization of model components. MMLU Score uses exact match accuracy for MMLU All Subjects from the HELM leaderboard.
Dataset Splits	No	The paper does not involve machine learning experiments that typically use training/test/validation dataset splits. The data mentioned in Appendix F is for empirical corroboration of theoretical findings, not for training or evaluating models in the traditional sense.
Hardware Specification	Yes	Answer: [Yes] Justification: The experiments can be run on a single CPU (anywhere from 10 to 40 minutes for a simulation, depending on the number of parameters checked).
Software Dependencies	No	The paper mentions that code is provided for reproducing plots in the NeurIPS checklist, but it does not explicitly list specific software dependencies (e.g., libraries, frameworks) along with their version numbers.
Experiment Setup	Yes	Figure 4: Indifference curves for the generalist over (p, θ) choices for game parameters cω = 0.01, ϵ = 0.15 and α0 ∈ {0.5, 1, 5}. Figure 5: Player utilities under various (p, θ) regulations when α0 = 0.1, cω = 0.05, and ϵ = 0.1 with Nash bargaining. Figure 6: Equilibrium outcomes for α0 = 0.1, cω = 0.01, ϵ = 0.1 with Nash bargaining show Pareto improvement over utilities (ω, α1, UG, UD) for any (p, θ) regulation in the dotted region.
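
The parameter settings quoted in the Experiment Setup row can be collected into a small configuration table, which may help when trying to rerun the figures. The following is a minimal sketch, not the authors' code: the key names (alpha0, c_omega, epsilon, bargaining) and the expand helper are hypothetical labels chosen for illustration, and the bargaining rule for Figure 4 is left as None because the row does not state it.

```python
from itertools import product

# Parameter values copied from the Experiment Setup row.
# Key names are hypothetical labels, not identifiers from the authors' code.
FIGURE_SETTINGS = {
    "Figure 4": {
        "alpha0": [0.5, 1, 5],
        "c_omega": [0.01],
        "epsilon": [0.15],
        "bargaining": [None],  # not stated in the row for this figure
    },
    "Figure 5": {
        "alpha0": [0.1],
        "c_omega": [0.05],
        "epsilon": [0.1],
        "bargaining": ["nash"],
    },
    "Figure 6": {
        "alpha0": [0.1],
        "c_omega": [0.01],
        "epsilon": [0.1],
        "bargaining": ["nash"],
    },
}


def expand(settings):
    """Return one dict per combination of the listed parameter values."""
    keys = list(settings)
    combos = product(*(settings[key] for key in keys))
    return [dict(zip(keys, values)) for values in combos]


# One list of concrete parameter dicts per figure.
runs = {name: expand(settings) for name, settings in FIGURE_SETTINGS.items()}

for name, configs in runs.items():
    print(name, len(configs), "configuration(s)")
```

Each entry in runs is a plain dictionary, so it could be passed to whatever simulation entry point the released plotting code exposes; the regulation grid over (p, θ) is not included because the row gives no ranges for it.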