Spectral Learning of Predictive State Representations with Insufficient Statistics

Authors: Alex Kulesza, Nan Jiang, Satinder Singh

AAAI 2015 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility assessment, listing each variable, its result, and the LLM response:
Research Type: Experimental
"We evaluate our approach on both synthetic and real-world problems. Using synthetic HMMs, we show that our method is robust to learning under a variety of transition topologies; compared to a baseline using the shortest tests and histories, our method achieves error rates up to an order of magnitude lower. We also demonstrate significantly improved prediction results on a real-world language modeling task using a large collection of text from Wikipedia."

Researcher Affiliation: Academia
"Computer Science & Engineering, University of Michigan, Ann Arbor, MI, USA"
Pseudocode: Yes
Algorithm 1: Search for sets of k tests and histories that approximately minimize max_o σ1(B_o).
Input: dataset D, initial T and H of size k, distributions p_T / p_H over candidate tests/histories, number of rounds r
  {B_o} := SPECTLEARN(D, T, H)
  σ_opt := max_{o ∈ O} σ1(B_o)
  for i = 1, ..., r do
    sample h' ~ p_H
    for h ∈ H do
      {B_o} := SPECTLEARN(D, T, (H \ {h}) ∪ {h'})
      σ(h) := max_{o ∈ O} σ1(B_o)
    h* := argmin_{h ∈ H} σ(h)
    if σ(h*) < σ_opt then
      σ_opt := σ(h*)
      H := (H \ {h*}) ∪ {h'}
    [Repeat the same procedure for T]
Output: T, H
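The listing above depends on the paper's SPECTLEARN subroutine, so any runnable rendering has to stub that call. The sketch below is a minimal Python version of the history-refinement half of the search (the paper repeats the same procedure for the tests); spect_learn, sample_history, and max_top_singular_value are hypothetical placeholders for the spectral-learning routine, the p_H sampler, and the σ1 computation, not identifiers from the paper.

```python
import numpy as np

def max_top_singular_value(B_by_obs):
    """Max over observations o of the largest singular value sigma_1(B_o)."""
    return max(np.linalg.svd(B, compute_uv=False)[0] for B in B_by_obs.values())

def refine_histories(data, tests, histories, spect_learn, sample_history, rounds):
    """Greedy swap search over the history set H (history half of Algorithm 1).

    spect_learn(data, tests, histories) -> {o: B_o} and sample_history() ~ p_H
    are assumed callables supplied by the caller.
    """
    histories = list(histories)
    sigma_opt = max_top_singular_value(spect_learn(data, tests, histories))
    for _ in range(rounds):
        h_new = sample_history()                      # sample h' ~ p_H
        scores = {}
        for i in range(len(histories)):               # try swapping each h for h'
            trial = histories[:i] + [h_new] + histories[i + 1:]
            scores[i] = max_top_singular_value(spect_learn(data, tests, trial))
        best = min(scores, key=scores.get)            # h* = argmin_h sigma(h)
        if scores[best] < sigma_opt:                  # keep the swap only if it helps
            sigma_opt = scores[best]
            histories[best] = h_new
        # The same swap procedure would then be repeated for the test set T.
    return histories, sigma_opt
```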
Open Source Code: No
No explicit statement about open-source code for the described methodology or a link to a repository was found.

Open Datasets: Yes
"Finally, we apply our algorithm to model a real-world text dataset of over 6.5 million sentences from Wikipedia articles (Sutskever, Martens, and Hinton 2011)."

Dataset Splits: No
"The majority of the data is used for training, but we reserve 100,000 sentences for evaluation."

Hardware Specification: No
No specific hardware details (e.g., GPU/CPU models, memory) were provided for the experimental setup.

Software Dependencies: No
The paper does not provide specific software names with version numbers or other detailed software dependencies.
Experiment Setup: Yes
"Our algorithm is initialized at the baseline T and H, and we sample new tests and histories whose length is one observation longer than the longest sequences in the baseline sets; the sampling probability of a sequence x is proportional to p²(x). We run our algorithm for 10 rounds."
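As a rough illustration of that setup, the snippet below draws candidate sequences with probability proportional to p(x)². The candidate_sequences construction and the empirical_prob helper are assumptions made for illustration; the paper only specifies the length constraint and the p²(x) weighting.

```python
import random

def candidate_sequences(baseline, observations):
    """Candidates one observation longer than the longest baseline sequences.

    Sequences are represented as tuples of observations. How the candidate
    pool is enumerated is an assumption; the paper states only that new
    sequences are one observation longer than the baseline's longest.
    """
    longest = max(len(x) for x in baseline)
    return [x + (o,) for x in baseline if len(x) == longest for o in observations]

def sample_candidate(candidates, empirical_prob):
    """Draw one candidate x with probability proportional to p(x)^2, where
    empirical_prob(x) is an assumed helper giving the empirical probability
    of sequence x under the training data."""
    weights = [empirical_prob(x) ** 2 for x in candidates]
    return random.choices(candidates, weights=weights, k=1)[0]

# The swap search is then run for 10 rounds, as stated in the setup above.
```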