Learning Mixtures of Markov Chains and MDPs

Authors: Chinmaya Kausik, Kevin Tan, Ambuj Tewari

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experimental results support these guarantees, where we attain 96.6% average accuracy on a mixture of two MDPs in gridworld, outperforming the EM algorithm with random initialization (73.2% average accuracy). We also significantly outperform the EM algorithm on real data from the Last FM song dataset."
Researcher Affiliation | Academia | "1Department of Mathematics, University of Michigan, Ann Arbor, USA. 2Department of Statistics, University of Michigan, Ann Arbor, USA."
Pseudocode | Yes | Algorithm 1 (Subspace Estimation), Algorithm 2 (Clustering), and Algorithm 3 in Appendix D.
Open Source Code | Yes | Code is available at https://github.com/hetankevin/mdpmix.
Open Datasets | Yes | "For our experiments with real-life data, we work with the Last.fm 1K dataset (Celma, 2010b; Lamere, 2008; Celma, 2010a)."
Dataset Splits | No | The paper describes dividing trajectories into Nsub and Nclust for subspace estimation and clustering, and Nclass for classification, but does not provide explicit training/validation/test splits in the conventional sense.
Hardware Specification | No | No specific hardware details (e.g., GPU models, CPU models, or cloud instance types) used for the experiments are mentioned in the paper.
Software Dependencies | No | The paper does not list software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x, TensorFlow 2.x).
Experiment Setup | No | The paper reports the number of trajectories, the value of K, and general dataset characteristics, but does not give specific hyperparameters such as learning rates, batch sizes, number of epochs, or optimizer settings.
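To illustrate the subspace-estimation-then-clustering pipeline that Algorithms 1 and 2 refer to, here is a minimal, self-contained sketch in numpy. All specifics are assumptions for illustration: it uses two toy 3-state Markov chains (not the paper's gridworld MDPs), flattened empirical transition matrices as per-trajectory features, an SVD for the top-K subspace, and a few Lloyd iterations of 2-means — a generic spectral recipe, not the authors' exact estimator.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two hypothetical 3-state Markov chains with distinct transition matrices.
P = [np.array([[0.8, 0.1, 0.1],
               [0.1, 0.8, 0.1],
               [0.1, 0.1, 0.8]]),
     np.array([[0.1, 0.45, 0.45],
               [0.45, 0.1, 0.45],
               [0.45, 0.45, 0.1]])]

def sample_trajectory(Pm, T):
    """Sample a length-T state trajectory from transition matrix Pm."""
    s = rng.integers(3)
    traj = [s]
    for _ in range(T - 1):
        s = rng.choice(3, p=Pm[s])
        traj.append(s)
    return traj

def empirical_transitions(traj, n=3):
    """Row-normalized empirical transition matrix of one trajectory."""
    C = np.zeros((n, n))
    for a, b in zip(traj, traj[1:]):
        C[a, b] += 1
    rows = C.sum(axis=1, keepdims=True)
    # Unvisited states fall back to a uniform row.
    return np.where(rows > 0, C / np.maximum(rows, 1), 1.0 / n)

# Sample N trajectories, alternating between the two chains.
N, T = 100, 200
labels = np.array([i % 2 for i in range(N)])
feats = np.stack([empirical_transitions(sample_trajectory(P[z], T)).ravel()
                  for z in labels])

# Subspace estimation: project features onto the top-K principal directions.
K = 2
centered = feats - feats.mean(axis=0)
_, _, Vt = np.linalg.svd(centered, full_matrices=False)
proj = centered @ Vt[:K].T

# Clustering: farthest-point initialization plus a few 2-means iterations.
centroids = proj[[0, np.argmax(np.linalg.norm(proj - proj[0], axis=1))]]
for _ in range(10):
    assign = np.argmin(np.linalg.norm(proj[:, None] - centroids[None],
                                      axis=2), axis=1)
    centroids = np.stack([proj[assign == k].mean(axis=0) for k in range(2)])

# Accuracy up to label permutation.
acc = max((assign == labels).mean(), (assign != labels).mean())
print(f"clustering accuracy: {acc:.2f}")
```

With well-separated chains and trajectories this long, the projected features form two tight clusters and the simple 2-means step recovers the mixture components almost perfectly; the paper's setting (MDPs, shorter trajectories, real data) is harder and motivates its more careful subspace and clustering algorithms.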