An Expectation-Maximization Algorithm to Compute a Stochastic Factorization From Data
Authors: Andre M. S. Barreto, Rafael L. Beirigo, Joelle Pineau, Doina Precup
IJCAI 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section we present computational experiments to illustrate the properties and usefulness of our algorithm. The first experiment is a proof of concept: we generated transition matrices $P \in \mathbb{R}^{100 \times 100}$, with $\mathrm{srk}(P) = 20$, and tried to recover them with EMSF using different values for $m$. We then compared EMSF's results with those obtained by directly estimating $P$ via maximum likelihood (referred to in the plots as CNT, for "counting"). |
| Researcher Affiliation | Academia | André M. S. Barreto and Rafael L. Beirigo, Laboratório Nacional de Computação Científica, Petrópolis, RJ, Brazil, {amsb,rafaelb}@lncc.br; Joelle Pineau and Doina Precup, McGill University, Montreal, QC, Canada, {jpineau,dprecup}@cs.mcgill.ca |
| Pseudocode | Yes | Algorithm 1 shows a step by step description of the proposed method, which we call EMSF. The pseudo-code uses two matrices to represent the forward and backward variables: $\alpha \in \mathbb{R}^{(\tau-1) \times m}$, where $\alpha_{ti} = \hat{\alpha}_i(t)$, and $\beta \in \mathbb{R}^{(\tau-1) \times m}$, where $\beta_{ti} = \hat{\beta}_i(t)$. For each sequence $z^l_{1:\tau_l}$, EMSF first computes $\alpha$ using (5) and then computes $\beta$ using (8). Then, $\alpha$ and $\beta$ are multiplied element-wise, giving rise to matrix $C \in \mathbb{R}^{(\tau-1) \times m}$. |
| Open Source Code | No | The paper does not provide any explicit statements about making source code available or links to a code repository. |
| Open Datasets | Yes | The game of blackjack was implemented exactly as described in Section 5.1 of Sutton and Barto's [1998] book. |
| Dataset Splits | No | The paper does not explicitly specify dataset splits for training, validation, and testing or reference a validation set. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used for running the experiments (e.g., CPU, GPU models, memory). |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers. |
| Experiment Setup | Yes | The first experiment is a proof of concept: we generated transition matrices $P \in \mathbb{R}^{100 \times 100}$, with $\mathrm{srk}(P) = 20$, and tried to recover them with EMSF using different values for $m$. ... For each value of $\tau$, $c = 10$ trajectories were generated per run. ... The game of blackjack was implemented exactly as described in Section 5.1 of Sutton and Barto's [1998] book. The experiments were carried out as follows. First, an exploration policy that selects actions uniformly at random played a number of hands of blackjack. |
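
The proof-of-concept setup quoted in the table can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the excerpts do not say how the matrices were generated, so the sketch assumes $P$ is built as the product of two random stochastic matrices $D$ (100 × 20) and $K$ (20 × 100), which guarantees only that the stochastic rank of $P$ is at most 20. The helper names (`random_stochastic`, `matmul`, `sample_trajectory`, `count_estimate`) and the trajectory length `tau = 1000` are hypothetical; only the sizes 100 and 20 and the value `c = 10` come from the quoted text. The sketch covers the counting (CNT) baseline only, not EMSF itself.

```python
import random

def random_stochastic(rows, cols, rng):
    """Return a rows x cols matrix with nonnegative rows that each sum to 1."""
    matrix = []
    for _ in range(rows):
        weights = [rng.random() for _ in range(cols)]
        total = sum(weights)
        matrix.append([w / total for w in weights])
    return matrix

def matmul(a, b):
    """Plain-Python product of a (n x m) and b (m x p)."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]

def sample_trajectory(p, length, rng):
    """Sample a state sequence of the given length from transition matrix p."""
    n = len(p)
    state = rng.randrange(n)  # assumption: uniform initial state
    path = [state]
    for _ in range(length - 1):
        state = rng.choices(range(n), weights=p[state])[0]
        path.append(state)
    return path

def count_estimate(trajectories, n):
    """Maximum-likelihood (counting) estimate of the transition matrix."""
    counts = [[0] * n for _ in range(n)]
    for path in trajectories:
        for s, s_next in zip(path, path[1:]):
            counts[s][s_next] += 1
    estimate = []
    for row in counts:
        total = sum(row)
        # Assumption: a state that was never left gets a uniform row.
        estimate.append([k / total for k in row] if total else [1.0 / n] * n)
    return estimate

rng = random.Random(0)
n, m, c = 100, 20, 10      # sizes and trajectory count from the quoted setup
tau = 1000                 # hypothetical trajectory length

D = random_stochastic(n, m, rng)
K = random_stochastic(m, n, rng)
P = matmul(D, K)           # product of stochastic matrices is stochastic

trajectories = [sample_trajectory(P, tau, rng) for _ in range(c)]
P_cnt = count_estimate(trajectories, n)
```

With few transitions per state, `P_cnt` has many zero entries, which is the regime in which a low-order factorization would be expected to help.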