Skip Context Tree Switching

Authors: Marc Bellemare, Joel Veness, Erik Talvitie

ICML 2014

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We provide a regret-based analysis of our approach, and empirically evaluate it on the Calgary corpus and a set of Atari 2600 screen prediction tasks.
Researcher Affiliation | Collaboration | Marc G. Bellemare (BELLEMARE@GOOGLE.COM) and Joel Veness (VENESS@GOOGLE.COM), Google DeepMind; Erik Talvitie (ERIK.TALVITIE@FANDM.EDU), Franklin and Marshall College.
Pseudocode | No | The paper describes the algorithm and its operations using text and mathematical equations, but it does not include a clearly labeled pseudocode block or algorithm figure.
Open Source Code | Yes | A reference implementation of SkipCTS is provided at: http://github.com/mgbellemare/SkipCTS.
Open Datasets | Yes | We ran SkipCTS (with D = 48, K = 1) and CTS (with D = 48) on the Calgary Corpus (Bell et al., 1989), an established compression benchmark composed of 14 different files. We also tested our algorithm on the task of video game screen prediction. We used the Arcade Learning Environment (Bellemare et al., 2013a), an interface that allows agents to interact with Atari 2600 games.
Dataset Splits | No | The paper mentions training on datasets but does not explicitly provide training/validation/test dataset splits with specific percentages or counts.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, memory) used for running the experiments.
Software Dependencies | No | The paper mentions using specific estimators (e.g., the KT estimator and the Sparse Adaptive Dirichlet (SAD) estimator) but does not provide specific version numbers for any software, libraries, or frameworks used.
Experiment Setup | Yes | We ran SkipCTS (with D = 48, K = 1) and CTS (with D = 48) on the Calgary Corpus (Bell et al., 1989). We trained SkipCTS with K = 0 and 1 on 54 Atari 2600 games. Each experiment consisted of 10 trials, each lasting 100,000 time steps, where one time step corresponds to 4 emulated frames. Each trial was assigned a specific random seed which was used for all values of K. We report the average log-loss per frame over the last 4500 time steps. Throughout our trials actions were selected uniformly at random from each game's set of legal actions.
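
The evaluation protocol in the Experiment Setup entry (10 seeded trials of 100,000 time steps each, scored by the average log-loss over the last 4500 steps) can be illustrated with a minimal sketch. This is not the authors' code. The names `KTEstimator`, `run_trial` and `run_experiment` are hypothetical, a plain binary KT estimator stands in for the SkipCTS model, a synthetic Bernoulli source stands in for Atari 2600 frames, and measuring log-loss in bits (base-2 logarithm) is an assumption.

```python
import math
import random


class KTEstimator:
    """Krichevsky-Trofimov estimator for a binary source (stand-in model)."""

    def __init__(self):
        self.counts = [0, 0]

    def prob(self, bit):
        # KT rule: (count of this symbol + 1/2) / (total count + 1)
        return (self.counts[bit] + 0.5) / (sum(self.counts) + 1.0)

    def update(self, bit):
        self.counts[bit] += 1


def run_trial(seed, num_steps=100_000, eval_window=4_500, p_one=0.8):
    """Return the average log-loss (bits per step) over the final eval_window steps."""
    rng = random.Random(seed)  # one fixed seed per trial
    model = KTEstimator()
    losses = []
    for _ in range(num_steps):
        # Hypothetical data source: Bernoulli(p_one) bits instead of game frames.
        bit = 1 if rng.random() < p_one else 0
        losses.append(-math.log2(model.prob(bit)))  # loss before seeing the bit
        model.update(bit)
    tail = losses[-eval_window:]
    return sum(tail) / len(tail)


def run_experiment(num_trials=10, **kwargs):
    """Average the per-trial tail log-loss over trials seeded 0..num_trials-1."""
    results = [run_trial(seed, **kwargs) for seed in range(num_trials)]
    return sum(results) / len(results)
```

Calling `run_experiment()` with the defaults mirrors the trial count, trial length and scoring window quoted above. Reusing the same seed for every model variant, as the setup describes for different values of K, means each variant is scored on the identical data stream.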