Adaptive Pairwise Weights for Temporal Credit Assignment

Authors: Zeyu Zheng, Risto Vuorio, Richard Lewis, Satinder Singh
Pages: 9225-9232

AAAI 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this empirical paper, we explore heuristics based on more general pairwise weightings...
Researcher Affiliation | Academia | (1) University of Michigan; (2) University of Oxford
Pseudocode | No | The paper states 'An overview of the algorithm is in the appendix' but does not include any pseudocode or clearly labeled algorithm blocks in the provided text.
Open Source Code | No | The paper mentions 'We compare against TVT by using their published code', which refers to a third party's code, but it provides no explicit statement or link for open-source code implementing the methodology described in this paper.
Open Datasets | Yes | We evaluated Meta-PWTD and -PWR [on] the Key-to-Door (KtD) environment (Hung et al. 2019) that is an elaborate umbrella problem that was designed to show-off the TVT algorithm's ability to solve TCA. ... bsuite (Osband et al. 2019) and Atari (Bellemare et al. 2013), both standard RL benchmarks.
Dataset Splits | No | The paper mentions tuning hyperparameters and repeating runs with different random seeds but does not provide explicit train/validation/test dataset splits (e.g., percentages or sample counts).
Hardware Specification | No | The paper does not describe the hardware used to run the experiments (e.g., GPU/CPU models, memory amounts).
Software Dependencies | No | The paper discusses various algorithms and environments but does not provide a reproducible description of ancillary software with specific version numbers (e.g., 'Python 3.8, PyTorch 1.9').
Experiment Setup | Yes | We tuned hyperparameters for each method on the mid-level configuration µ = 5, σ = 25 and kept them fixed for the other 8 configurations. Each method has a distinct set of parameters (e.g. outer-loop learning rates, λ). More details are in the appendix.