Improving Exploration in UCT Using Local Manifolds
Authors: Sriram Srinivasan, Erik Talvitie, Michael Bowling
AAAI 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | On domains inspired by video games, empirical evidence shows that our algorithm is more sample efficient than UCT, particularly when rewards are sparse. Figure 3 summarizes the results of our grid world experiments. |
| Researcher Affiliation | Academia | Sriram Srinivasan, University of Alberta, ssriram@ualberta.ca; Erik Talvitie, Franklin and Marshall College, erik.talvitie@fandm.edu; Michael Bowling, University of Alberta, mbowling@cs.ualberta.ca |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | No | The paper does not provide any concrete access information for source code, such as repository links, explicit code release statements, or mention of code in supplementary materials. |
| Open Datasets | No | The paper describes custom "grid world domains" and "game domains" (Freeway, Space Invaders, Seaquest) which appear to be internally generated or inspired by, but not directly using, publicly accessible datasets with specific links or citations for access. For example: "Consider the grid world domain shown in Figure 1(a)." and "We also test our algorithm on three domains inspired by video games." |
| Dataset Splits | No | The paper does not provide specific dataset split information (e.g., exact percentages, sample counts for train/validation/test sets) needed to reproduce the data partitioning. It discusses 'rollouts' for planning and 'validating intuitions' but not in the context of data splits. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running its experiments. It only vaguely mentions 'computing resources provided by Compute Canada through Westgrid' in the acknowledgements. |
| Software Dependencies | No | The paper does not provide specific ancillary software details, such as library or solver names with version numbers, needed to replicate the experiment. It mentions algorithms like UCT and Isomap but not their software implementations or versions. |
| Experiment Setup | Yes | A wide range {10⁻⁵, 10⁻⁴, ..., 10³, 10⁴} was tested for the UCB parameter with 500 trials performed for each parameter setting. For the UCT algorithms with generalization, the Gaussian kernel function with an initial Gaussian width σ chosen from {10, 100, 1000, 10000} was used, and the decay rate β chosen from {0.1, 0.5, 0.9, 0.99}, and the best result is reported. The discount factor was set at 0.99. |
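
The experiment setup quoted in the table can be written out as an explicit parameter grid. The sketch below is illustrative only and is not code from the paper. It assumes the UCB parameter range consists of the ten powers of ten from 10⁻⁵ to 10⁴, and it assumes that for the generalizing UCT variants the UCB parameter is crossed with every (σ, β) pair, which the quoted text does not state. All names (`UCB_CONSTANTS`, `plain_uct_grid`, `generalized_uct_grid`) are hypothetical.

```python
from itertools import product

# Values taken from the quoted setup; the UCB range is read as the ten
# powers of ten from 10^-5 through 10^4 (an assumption about the source).
UCB_CONSTANTS = [10.0 ** k for k in range(-5, 5)]
KERNEL_WIDTHS = [10, 100, 1000, 10000]   # initial Gaussian width sigma
DECAY_RATES = [0.1, 0.5, 0.9, 0.99]      # decay rate beta
DISCOUNT = 0.99                          # discount factor
TRIALS_PER_SETTING = 500                 # trials per parameter setting


def plain_uct_grid():
    """Settings for plain UCT: only the UCB parameter varies."""
    return [{"ucb_c": c, "gamma": DISCOUNT} for c in UCB_CONSTANTS]


def generalized_uct_grid():
    """Settings for UCT with generalization.

    Assumption: the UCB parameter is fully crossed with sigma and beta.
    """
    return [
        {"ucb_c": c, "sigma": s, "beta": b, "gamma": DISCOUNT}
        for c, s, b in product(UCB_CONSTANTS, KERNEL_WIDTHS, DECAY_RATES)
    ]


if __name__ == "__main__":
    print(len(plain_uct_grid()), "plain UCT settings")
    print(len(generalized_uct_grid()), "generalized UCT settings")
    print(TRIALS_PER_SETTING, "trials per setting")
```

Under these assumptions, each setting would be run for 500 trials and the best-performing setting reported, as the quoted text describes.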