Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. [K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026].

Learning from Delayed Feedback in Games via Extra Prediction

Authors: Yuma Fujimoto, Kenshi Abe, Kaito Ariu

NeurIPS 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The theoretical results are supported and strengthened by our experiments. Our experiments (Figs. 1-3) also support and reinforce these corollaries.
Researcher Affiliation | Industry | Yuma Fujimoto (CyberAgent, EMAIL); Kenshi Abe (CyberAgent, EMAIL); Kaito Ariu (CyberAgent, EMAIL)
Pseudocode | Yes | Algorithm 3 (Vanilla FTRL with time delay). When m_i^t = 0, generalized FTRL corresponds to vanilla FTRL. Algorithm 4 (Optimistic FTRL with time delay). When m_i^t = u_i^{t-m}, generalized FTRL corresponds to optimistic FTRL (OFTRL). Algorithm 8 (Weighted Optimistic Follow the Regularized Leader). Weighted Optimistic Follow the Regularized Leader (WOFTRL) is given by m_i^t = n u_i^{t-m} for n ∈ N in generalized FTRL.
Open Source Code | Yes | The codes are available at https://github.com/CyberAgentAILab/delayed_learning_games
Open Datasets | No | The paper uses well-known theoretical game setups (Matching Pennies, Sato's Game, Rock-Paper-Scissors) for simulations, which do not involve external datasets in the traditional sense.
Dataset Splits | No | The paper conducts simulations of theoretical games (Matching Pennies, Sato's Game, Rock-Paper-Scissors) and therefore does not use external datasets requiring explicit train/test/validation splits.
Hardware Specification | Yes | Operating System: macOS Monterey (version 12.4); Programming Language: Python 3.11.3; Processor: Apple M1 Pro (10 cores); Memory: 32 GB
Software Dependencies | Yes | Operating System: macOS Monterey (version 12.4); Programming Language: Python 3.11.3
Experiment Setup | Yes | A. The phase diagram of social regrets for various time delays m (horizontal) and optimistic weights n (vertical). ... We set the parameters as T = 10^5 and η = 10^{-2}. ... B. The convergence of social regrets for various optimistic weights n and a fixed time delay m = 10 ... We set the parameters as η = 10^{-2}. ... C. The scale of social regrets in the case of m = 10 and n = 11 ... We set η = 1/√T for the blue dots and η = 10^{-2} for the orange ones. ... We set the parameters as η = 10^{-1} and m = 4.
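
To make the Pseudocode and Experiment Setup entries more concrete, the following is a minimal sketch of generalized FTRL with delayed feedback in a two-player zero-sum game such as Matching Pennies. It is not the authors' implementation (the linked repository is the authoritative source). Several details are assumptions made for illustration: the entropic regularizer (so the update is a softmax), the indexing in which round t sees rewards only up to round t - m - 1, the hypothetical `bias` parameter that moves the starting point away from the uniform equilibrium, and the function name `delayed_woftrl`. The short horizon and the step size `eta=1e-2` are illustrative choices.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax (FTRL maximizer under an entropic regularizer)."""
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()


def delayed_woftrl(A, T=2000, eta=1e-2, m=10, n=11, bias=0.5):
    """Simulate two-player zero-sum play with feedback delayed by m rounds.

    Player 1's reward vector is A @ y; player 2's is -A.T @ x.
    The prediction is n times the newest visible reward vector:
    n = 0 is vanilla FTRL, n = 1 is OFTRL, general n is WOFTRL.
    Returns both strategy trajectories and the social regret.
    """
    A = np.asarray(A, dtype=float)
    k1, k2 = A.shape
    # Assumption: a fixed logit offset so play does not start at uniform.
    prior1 = np.zeros(k1)
    prior1[0] = bias
    prior2 = np.zeros(k2)
    prior2[0] = bias

    U1 = np.zeros((T, k1))  # U1[t]: reward vector of player 1 in round t
    U2 = np.zeros((T, k2))  # U2[t]: reward vector of player 2 in round t
    X = np.zeros((T, k1))   # strategies of player 1
    Y = np.zeros((T, k2))   # strategies of player 2
    cum1 = np.zeros(k1)     # sum of rewards visible to player 1 so far
    cum2 = np.zeros(k2)

    for t in range(T):
        s = t - m - 1  # newest round whose reward is visible in round t
        if s >= 0:
            cum1 += U1[s]
            cum2 += U2[s]
            pred1, pred2 = n * U1[s], n * U2[s]
        else:
            pred1, pred2 = np.zeros(k1), np.zeros(k2)
        X[t] = softmax(eta * (cum1 + pred1) + prior1)
        Y[t] = softmax(eta * (cum2 + pred2) + prior2)
        U1[t] = A @ Y[t]
        U2[t] = -A.T @ X[t]

    # Regret of each player against the best fixed action in hindsight.
    reg1 = U1.sum(axis=0).max() - np.einsum("ti,ti->", X, U1)
    reg2 = U2.sum(axis=0).max() - np.einsum("ti,ti->", Y, U2)
    return X, Y, reg1 + reg2


if __name__ == "__main__":
    matching_pennies = np.array([[1.0, -1.0], [-1.0, 1.0]])
    for weight in (0, 1, 11):
        _, _, regret = delayed_woftrl(matching_pennies, m=10, n=weight)
        print(f"n = {weight:2d}: social regret = {regret:.4f}")
```

Sweeping `m` and `n` over a grid with this function gives a rough analogue of the phase diagram described in the Experiment Setup entry, though the numbers should not be expected to match the paper's figures.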