Time-Regularized Interrupting Options (TRIO)

Authors: Timothy Mann, Daniel Mankowitz, Shie Mannor

ICML 2014

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results demonstrate that this approach can derive a good set of high-level skills even when the original set of skills cannot solve the problem.
Researcher Affiliation | Academia | Daniel J. Mankowitz DANIELM@TX.TECHNION.AC.IL; Timothy A. Mann MANN@EE.TECHNION.AC.IL; Shie Mannor SHIE@EE.TECHNION.AC.IL; Electrical Engineering Department, The Technion Israel Institute of Technology, Haifa 32000, Israel
Pseudocode | Yes | Algorithm 1 Interrupting Option Value Iteration
Open Source Code | No | The paper does not provide any concrete access information (e.g., repository link, explicit statement of code release) for the source code of its methodology.
Open Datasets | No | The paper mentions using a 'gridworld (Sutton & Barto, 1998)' and discusses an 'inventory management domain (Scarf, 1959)' but does not provide concrete access information (link, DOI, formal citation with authors/year, or specific name of an established benchmark dataset) for any publicly available or open dataset used in its experiments.
Dataset Splits | No | The paper does not specify exact dataset split percentages or sample counts for training, validation, or test sets, nor does it reference predefined splits with citations.
Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU/CPU models, memory) used to run the experiments.
Software Dependencies | No | The paper does not list any specific software components with version numbers (e.g., Python 3.8, PyTorch 1.9, CPLEX 12.4) that would be needed to replicate the experiment.
Experiment Setup | Yes | The resulting algorithm has two tunable parameters l and λ, where l controls the frequency at which the options are updated and λ ∈ [0, 1] controls the time-based regularization. We experimented with l = {1, 10, 20, 30, 40} and λ = {0, 0.1, 0.3, 0.5}, unless noted otherwise.
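
The parameter values quoted under Experiment Setup can be enumerated as a simple grid. The following is a minimal sketch, assuming a full Cartesian sweep over both lists (the summary does not state whether every pair was actually run). The names `UPDATE_INTERVALS`, `LAMBDAS`, and `parameter_grid` are hypothetical and do not come from the paper.

```python
from itertools import product

# Values quoted in the Experiment Setup entry.
UPDATE_INTERVALS = [1, 10, 20, 30, 40]  # l: how often the options are updated
LAMBDAS = [0.0, 0.1, 0.3, 0.5]          # lambda in [0, 1]: time-based regularization


def parameter_grid(intervals=UPDATE_INTERVALS, lambdas=LAMBDAS):
    """Return every (l, lambda) pair as a dict.

    Assumption: a full Cartesian product; the source only lists the values.
    """
    for lam in lambdas:
        # The source constrains lambda to the interval [0, 1].
        if not 0.0 <= lam <= 1.0:
            raise ValueError(f"lambda must lie in [0, 1], got {lam}")
    return [{"l": l, "lam": lam} for l, lam in product(intervals, lambdas)]


if __name__ == "__main__":
    grid = parameter_grid()
    print(f"{len(grid)} configurations")
    for config in grid:
        print(config)
```

Each dictionary could then be passed to whatever training routine implements the algorithm; that routine is not shown here because the summary does not describe it.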