Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Mean-Field Games With Finitely Many Players: Independent Learning and Subjectivity
Authors: Bora Yongacoglu, Gürdal Arslan, Serdar Yüksel
JMLR 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We ran Algorithm 3 on a 20-player mean-field game with compressed observability, described below in (12). This game can be interpreted as a model for decision-making during an epidemic or as a model for vehicle use decisions in a traffic network. ... Using the game in (12), we ran 250 independent trials of self-play under Algorithm 3, where each trial consisted of 20 exploration phases and each exploration phase consisted of 25,000 stage games. ... Our results are summarized in Figures 2 and 3 and in Table 1. In Figure 2, we plot the frequency of subjective ϵ-equilibrium against the exploration phase index. ... Table 1: Frequency of subjective ϵ-equilibrium. π_{k,τ} denotes the policy for exploration phase k during the τ-th trial. |
| Researcher Affiliation | Academia | Bora Yongacoglu EMAIL Department of Electrical and Computer Engineering, University of Toronto; Gürdal Arslan EMAIL Department of Electrical Engineering, University of Hawaii at Manoa; Serdar Yüksel EMAIL Department of Mathematics and Statistics, Queen's University |
| Pseudocode | Yes | Algorithm 1: Independent Learning of (Subjective) Value Functions; Algorithm 2: Subjective ϵ-satisficing Policy Revision (for player i ∈ N); Algorithm 3: Independent Learning |
| Open Source Code | No | The paper does not provide any explicit statement about releasing source code for the methodology described, nor does it include any links to code repositories. |
| Open Datasets | No | The paper describes a custom-designed simulation environment and its parameters in Section 6, 'Simulation Study', rather than using a pre-existing or publicly available dataset. There is no mention or reference to any publicly accessible dataset. |
| Dataset Splits | No | The paper describes running simulations, not experiments on a dataset with traditional splits. It mentions '250 independent trials of self-play' in which 'each exploration phase consisted of 25,000 stage games', in the context of simulation runs; this is not equivalent to dataset splits for training, validation, and testing. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, or cloud computing specifications) used to run the simulations described in the 'Simulation Study' section. |
| Software Dependencies | No | The paper does not provide specific software names with version numbers for implementation (e.g., Python, PyTorch, TensorFlow, specific solvers, etc.). |
| Experiment Setup | Yes | The game G used for our simulation is given by the following list: G = (N, X, Y, A, {ϕ^i}_{i∈N}, P_loc, c, γ, ν_0, ...). ... N = {1, 2, ..., 20} is a set of 20 players. ... X = {bad, medium, good}. ... A = {go, wait, heal}. ... The stage cost function c : X × Δ(X) × A → ℝ is given by c(s, ν, a) := R_go·1{a = go} + R_bad·1{s = bad} + R_heal·1{a = heal} for all (s, ν, a) ∈ X × Δ(X) × A, where R_go = 5 is a reward for undertaking one's usual business, R_bad = 10 is a penalty for being in bad condition, and R_heal = 3 is the cost of seeking healthcare. The discount factor γ = 0.8, and ν_0 ∈ Δ(X) is the product uniform distribution: x^i_0 ∼ Unif(X) for each i ∈ N, and the random variables {x^i_0}_{i∈N} are jointly independent. ... Our chosen parameters were ϵ = 5, d^i = 1.5, and e^i = 0.25. |
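The stage cost quoted in the Experiment Setup row is simple enough to transcribe directly. The sketch below is a minimal, hypothetical rendering of that formula only; parameter values (R_go = 5, R_bad = 10, R_heal = 3, γ = 0.8) are taken verbatim from the excerpt, and nothing here reproduces the paper's actual learning algorithms. Note the excerpt calls R_go a "reward" inside a cost function, so the paper may attach a negative sign to it; the value is kept as quoted here.

```python
from typing import Dict

# State and action sets from the quoted setup of the 20-player game.
STATES = ("bad", "medium", "good")
ACTIONS = ("go", "wait", "heal")

# Parameter values exactly as quoted in the excerpt. The excerpt describes
# R_GO as a reward within a cost function; whether the paper uses -5 for it
# is not confirmed by the excerpt, so the quoted value is kept.
R_GO, R_BAD, R_HEAL = 5.0, 10.0, 3.0
GAMMA = 0.8  # discount factor, as quoted

def stage_cost(s: str, nu: Dict[str, float], a: str) -> float:
    """c(s, nu, a) = R_go*1{a=go} + R_bad*1{s=bad} + R_heal*1{a=heal}.

    nu is the mean-field term (distribution over states); per the quoted
    formula it does not enter this particular cost expression.
    """
    return (R_GO * (a == "go")
            + R_BAD * (s == "bad")
            + R_HEAL * (a == "heal"))

# Product-uniform initial distribution nu_0, as described in the excerpt.
nu0 = {s: 1.0 / len(STATES) for s in STATES}

# Example: a player in bad condition who seeks healthcare incurs 10 + 3.
print(stage_cost("bad", nu0, "heal"))  # 13.0
```

Since the cost depends on the mean-field term ν only through the function signature here, extending this sketch to the paper's full dynamics would require the local transition kernel P_loc and observation channels {ϕ^i}, which the excerpt does not specify.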