Dynamic-Depth Context Tree Weighting
Authors: João V. Messias, Shimon Whiteson
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We now present empirical results on byte-prediction tasks and partially-observable RL. Our code and instructions for its use is publicly available at: https://bitbucket.org/jmessias/vmm_py. |
| Researcher Affiliation | Collaboration | João V. Messias, Morpheus Labs, Oxford, UK (jmessias@morpheuslabs.co.uk); Shimon Whiteson, University of Oxford, Oxford, UK (shimon.whiteson@cs.ox.ac.uk). "During the development of this work, the main author was employed by the University of Amsterdam." |
| Pseudocode | Yes | The complete D2-CTW algorithm operates as follows (please refer to Appendix A.3 for the respective pseudo-code) |
| Open Source Code | Yes | Our code and instructions for its use is publicly available at: https://bitbucket.org/jmessias/vmm_py. |
| Open Datasets | Yes | We compare the performance of D2-CTW against CTW on the 18-file variant of the Calgary Corpus [3], a benchmark of text and binary data files. |
| Dataset Splits | No | The paper does not provide specific details about training/validation/test dataset splits. For byte prediction, it describes a streaming process, and for RL, online interaction, which do not typically involve fixed splits. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments (e.g., GPU/CPU models, memory). |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., libraries, frameworks). |
| Experiment Setup | Yes | For CTW, we performed a grid search taking K ∈ {1, . . . , 10} for each file. For D2-CTW, we investigated the effect of γ on the prediction log-loss across different files, and found no significant effect of this parameter for sufficiently large values (an example is shown in Fig. 1f), in accordance with Theorem 1. Consequently, we set γ = 10 for all our D2-CTW runs. We also set L = ∞ and H = 2. (A hedged sketch of this grid-search setup appears below the table.) |
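
The Experiment Setup row quotes a per-file grid search over the CTW depth bound K ∈ {1, . . . , 10}, selecting the depth by sequential prediction log-loss. The following is a minimal Python sketch of how such a grid search could be driven; the `CTWPredictor` class and its `predict` method are assumptions introduced here for illustration (the predictor body is stubbed so the example runs), not the authors' implementation, which is available at https://bitbucket.org/jmessias/vmm_py.

```python
# Hedged sketch (not from the paper): grid search over the CTW depth bound K
# on a byte stream, selecting the depth with the lowest cumulative log-loss.
# CTWPredictor is a hypothetical interface standing in for a real CTW model.
import math


class CTWPredictor:
    """Placeholder for a depth-bounded CTW byte predictor (assumed interface)."""

    def __init__(self, max_depth: int):
        self.max_depth = max_depth

    def predict(self, symbol: int) -> float:
        """Return P(symbol | context) and update the model; stubbed here."""
        return 1.0 / 256.0  # uniform stub so the sketch runs end-to-end


def cumulative_log_loss(data: bytes, model: CTWPredictor) -> float:
    """Sequential log-loss (in bits) of the model over a byte stream."""
    loss = 0.0
    for symbol in data:
        p = model.predict(symbol)
        loss -= math.log2(p)
    return loss


def grid_search_depth(data: bytes, depths=range(1, 11)) -> int:
    """Pick the depth bound K in {1, ..., 10} with the lowest log-loss,
    mirroring the per-file grid search described for the CTW baseline."""
    losses = {k: cumulative_log_loss(data, CTWPredictor(max_depth=k)) for k in depths}
    return min(losses, key=losses.get)


if __name__ == "__main__":
    stream = b"example byte stream standing in for a Calgary Corpus file"
    print("best K:", grid_search_depth(stream))
```

With a real predictor, the per-symbol probabilities would differ across depths, so the selected K would vary per file, which is why the paper reports a separate grid search for each Calgary Corpus file.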