Learning Mixtures of Markov Chains and MDPs
Authors: Chinmaya Kausik, Kevin Tan, Ambuj Tewari
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results support these guarantees, where we attain 96.6% average accuracy on a mixture of two MDPs in gridworld, outperforming the EM algorithm with random initialization (73.2% average accuracy). We also significantly outperform the EM algorithm on real data from the Last FM song dataset. |
| Researcher Affiliation | Academia | ¹Department of Mathematics, University of Michigan, Ann Arbor, USA; ²Department of Statistics, University of Michigan, Ann Arbor, USA. |
| Pseudocode | Yes | Algorithm 1 (Subspace Estimation), Algorithm 2 (Clustering), Algorithm 3 (in Appendix D) |
| Open Source Code | Yes | Code is available at https://github.com/hetankevin/mdpmix. |
| Open Datasets | Yes | For our experiments with real-life data, we work with the Last.fm 1K dataset (Celma, 2010b; Lamere, 2008; Celma, 2010a). |
| Dataset Splits | No | The paper describes dividing trajectories into N_sub and N_clust for subspace estimation and clustering, and N_class for classification, but does not provide explicit training, validation, and test dataset splits in the conventional sense for model training and evaluation. |
| Hardware Specification | No | No specific hardware details (like GPU models, CPU models, or cloud instance types) used for running the experiments are mentioned in the paper. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x, TensorFlow 2.x). |
| Experiment Setup | No | The paper describes the number of trajectories, the value of K, and general dataset characteristics. However, it does not provide specific hyperparameters such as learning rates, batch sizes, number of epochs, or optimizer settings for the algorithms used in the experiments. |
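
The Dataset Splits row notes that trajectories are divided into groups for subspace estimation, clustering, and classification. As a rough illustration only, the sketch below partitions a list of trajectories into three subsets of sizes `n_sub`, `n_clust`, and the remainder. The function name `split_trajectories`, the random shuffle, the seed, the disjointness of the subsets, and the toy trajectory format are all assumptions made for this example; they are not taken from the paper or from the linked repository.

```python
import random


def split_trajectories(trajectories, n_sub, n_clust, seed=0):
    """Partition trajectories into three disjoint subsets.

    Hypothetical helper: the first subset (size n_sub) would feed subspace
    estimation, the second (size n_clust) clustering, and the remainder
    classification. Shuffling and disjointness are assumptions here.
    """
    if n_sub < 0 or n_clust < 0 or n_sub + n_clust > len(trajectories):
        raise ValueError("n_sub + n_clust must not exceed the number of trajectories")

    # Shuffle indices with a fixed seed so the split is repeatable.
    rng = random.Random(seed)
    indices = list(range(len(trajectories)))
    rng.shuffle(indices)

    sub = [trajectories[i] for i in indices[:n_sub]]
    clust = [trajectories[i] for i in indices[n_sub:n_sub + n_clust]]
    classify = [trajectories[i] for i in indices[n_sub + n_clust:]]
    return sub, clust, classify


# Toy data (assumed format): each trajectory is a list of (state, action) pairs.
toy = [[(i, 0), (i + 1, 1)] for i in range(10)]
sub, clust, classify = split_trajectories(toy, n_sub=3, n_clust=4)
```

Here the classification subset simply receives whatever is left after the first two groups are drawn, so its size is `len(trajectories) - n_sub - n_clust`.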