Representation Discovery for MDPs Using Bisimulation Metrics
Authors: Sherry Ruan, Gheorghe Comanici, Prakash Panangaden, Doina Precup
AAAI 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We illustrate the improvement by computing bisimulation metrics over a series of MDPs (the well-known Puddle World problem)... The bottom panel of Figure 1 shows the runtime of computing bisimulation metrics over partitions for up to 14 iterations of Algorithm 1. These empirical results illustrate that the computational complexity of computing bisimulation-based representations and corresponding metrics is mostly dependent on the intrinsic complexity of the reward function and transition models. To assess the importance of the asynchronous partition and metric update, we fixed the size of the Puddle World and compared the value function approximation error as a function of the size of intermediate representations. ...As can be seen in Figure 2, the asynchronous algorithm reaches representations of higher quality at much earlier stages of the iterative framework. This empirical result illustrates the use of heuristics can substantially speed up the computation. |
| Researcher Affiliation | Academia | Sherry Shanshan Ruan, Gheorghe Comanici, Prakash Panangaden, and Doina Precup School of Computer Science McGill University, Montreal, QC, Canada {sherry, gcoman, prakash, dprecup}@cs.mcgill.ca |
| Pseudocode | Yes | Algorithm 1 Partition declustering |
| Open Source Code | No | The paper does not provide any specific links or explicit statements about the release of source code for the described methodology. |
| Open Datasets | No | The paper mentions 'the well-known Puddle World problem' but does not provide concrete access information (specific link, DOI, repository name, or a formal citation with authors/year) for the dataset used in their experiments. |
| Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, or citations to predefined splits) needed to reproduce the data partitioning. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details, such as library or solver names with version numbers, needed to replicate the experiment. |
| Experiment Setup | No | The paper does not contain specific experimental setup details such as concrete hyperparameter values, detailed training configurations, or system-level settings in the main text. |