Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Adaptive Pairwise Weights for Temporal Credit Assignment
Authors: Zeyu Zheng, Risto Vuorio, Richard Lewis, Satinder Singh (pp. 9225-9232)
AAAI 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this empirical paper, we explore heuristics based on more general pairwise weightings... |
| Researcher Affiliation | Academia | ¹University of Michigan, ²University of Oxford |
| Pseudocode | No | The paper states 'An overview of the algorithm is in the appendix' but does not include any pseudocode or clearly labeled algorithm blocks in the provided text. |
| Open Source Code | No | The paper mentions 'We compare against TVT by using their published code', referring to a third-party's code, but does not provide an explicit statement or link for the open-source code for the methodology described in this paper. |
| Open Datasets | Yes | We evaluated Meta-PWTD and -PWR the Key-to-Door (KtD) environment (Hung et al. 2019) that is an elaborate umbrella problem that was designed to show-off the TVT algorithm's ability to solve TCA. ... bsuite (Osband et al. 2019) and Atari (Bellemare et al. 2013), both standard RL benchmarks. |
| Dataset Splits | No | The paper mentions tuning hyperparameters and repeating runs with different random seeds but does not explicitly provide specific train/validation/test dataset splits (e.g., percentages or sample counts). |
| Hardware Specification | No | The paper does not explicitly describe any specific hardware specifications (e.g., GPU/CPU models, memory amounts) used for running the experiments. |
| Software Dependencies | No | The paper discusses various algorithms and environments but does not provide a reproducible description of ancillary software with specific version numbers (e.g., 'Python 3.8, PyTorch 1.9'). |
| Experiment Setup | Yes | We tuned hyperparameters for each method on the mid-level configuration µ = 5, σ = 25 and kept them fixed for the other 8 configurations. Each method has a distinct set of parameters (e.g. outer-loop learning rates, λ). More details are in the appendix. |