Improving Predictive State Representations via Gradient Descent
Authors: Nan Jiang, Alex Kulesza, Satinder Singh
AAAI 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We first show on synthetic domains that our proposed gradient procedure can improve the model, and that spectral learning provides a useful initialization. We investigate the effectiveness of our gradient procedure on a character-level language modeling problem using Wikipedia data |
| Researcher Affiliation | Academia | Nan Jiang and Alex Kulesza and Satinder Singh nanjiang@umich.edu, kulesza@gmail.com, baveja@umich.edu Computer Science & Engineering University of Michigan |
| Pseudocode | Yes | Algorithm 1 Stochastic Gradient Descent with Contrastive Divergence for Predictive State Representations. |
| Open Source Code | No | The paper does not contain an explicit statement about releasing their own source code or provide a link to a repository for it. |
| Open Datasets | Yes | We investigate the effectiveness of our gradient procedure on a character-level language modeling problem using Wikipedia data (Sutskever, Martens, and Hinton 2011) |
| Dataset Splits | No | The paper mentions training and testing datasets, but does not explicitly specify a validation dataset split. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | We use a constant learning rate of η = 10⁻⁶. To prevent the model parameters from experiencing sudden changes due to occasional stochastic gradients with a large magnitude, we rescale the stochastic gradient term Δ to guarantee that ‖Δ‖ ≤ 10. The learning rate and momentum parameters are set to 10⁻⁷ and 0.9, respectively. |
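
The experiment-setup details quoted above describe a constant learning rate, rescaling of occasionally large stochastic gradients, and a momentum term. The sketch below is a minimal, hypothetical illustration of such an update step, assuming a Euclidean-norm cap of 10 and a classical momentum form; the helper `compute_stochastic_gradient` and the parameter container are illustrative names, not taken from the paper.

```python
# Hypothetical sketch of the update described in the Experiment Setup row:
# constant learning rate, rescaling of large stochastic gradients, momentum.
# Names (sgd_step, compute_stochastic_gradient, MAX_NORM) are assumptions.
import numpy as np

ETA = 1e-7        # constant learning rate (paper reports 1e-6 / 1e-7 in different experiments)
MOMENTUM = 0.9    # momentum coefficient reported in the paper
MAX_NORM = 10.0   # rescale so the gradient magnitude stays <= 10, as described

def sgd_step(params, velocity, compute_stochastic_gradient, example):
    """One stochastic gradient step with gradient rescaling and momentum."""
    delta = compute_stochastic_gradient(params, example)

    # Rescale occasional stochastic gradients with a large magnitude.
    norm = np.linalg.norm(delta)
    if norm > MAX_NORM:
        delta = delta * (MAX_NORM / norm)

    # Classical momentum update (assumed form; the paper does not spell it out).
    velocity = MOMENTUM * velocity - ETA * delta
    params = params + velocity
    return params, velocity
```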