Randomised Procedures for Initialising and Switching Actions in Policy Iteration
Authors: Shivaram Kalyanakrishnan, Neeldhara Misra, Aditya Gopalan
AAAI 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | With the objective of furnishing improved upper bounds for PI, we introduce two randomised procedures in this paper. Our first contribution is a routine to find a good initial policy for PI. ... Our second contribution is a randomised action-switching rule for PI, which admits a bound of $(2 + \ln(k - 1))^n$ on the expected number of iterations. To the best of our knowledge, this is the tightest complexity bound known for PI when $k \geq 3$. |
| Researcher Affiliation | Academia | Shivaram Kalyanakrishnan, Indian Institute of Technology Bombay, Mumbai 400076, India (shivaram@cse.iitb.ac.in); Neeldhara Misra, Indian Institute of Technology Gandhinagar, Gandhinagar 382355, India (neeldhara.misra@gmail.com); Aditya Gopalan, Indian Institute of Science, Bengaluru 560012, India (aditya@ece.iisc.ernet.in) |
| Pseudocode | Yes | Procedure Guess-and-Max(t) and Algorithm RSPI are provided in the paper. |
| Open Source Code | No | The paper does not contain any statement about making source code publicly available, nor does it provide links to any code repositories. |
| Open Datasets | No | As a theoretical paper, it does not describe using datasets for training or provide access information for any datasets. |
| Dataset Splits | No | As a theoretical paper, it does not involve data splits for validation. |
| Hardware Specification | No | As a theoretical paper, it does not involve empirical experiments and therefore does not mention any hardware specifications. |
| Software Dependencies | No | As a theoretical paper, it does not describe specific software dependencies with version numbers for experimental reproducibility. |
| Experiment Setup | No | As a theoretical paper, it does not describe an experimental setup with hyperparameters or system-level training settings. |