Reified Context Models
Authors: Jacob Steinhardt, Percy Liang
ICML 2015 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, we show that our approach obtains expressivity and coverage on three sequence modeling tasks. |
| Researcher Affiliation | Academia | Jacob Steinhardt JSTEINHARDT@CS.STANFORD.EDU Percy Liang PLIANG@CS.STANFORD.EDU Stanford University, 353 Serra Street, Stanford, CA 94305 USA |
| Pseudocode | No | The paper describes the RCMS procedure in text but does not provide pseudocode or a clearly labeled algorithm block. |
| Open Source Code | Yes | The code, data, and the experiments for this paper are available on CodaLab at https://www.codalab.org/worksheets/0x8967960a7c644492974871ee60198401/. Finally, to showcase the ease of implementation of our method, we provide implementation details and runtime comparisons in the supplementary material, as well as runnable source code in our CodaLab worksheet. |
| Open Datasets | Yes | Handwriting recognition. The first task is the handwriting recognition task from Kassel (1995); we use the clean version of the dataset from Weiss & Taskar (2010). Speech recognition (decoding). Our second task is from the Switchboard speech transcription project (Greenberg et al., 1996). Decipherment. We created a dataset from the English Gigaword corpus (Graff & Cieri, 2003). |
| Dataset Splits | No | The paper mentions splits for training and testing, but it does not explicitly state the use of a separate validation set for any of the tasks. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as CPU or GPU models. |
| Software Dependencies | No | The paper mentions the use of AdaGrad as an optimization algorithm but does not specify versions for any software libraries, frameworks, or programming languages used. |
| Experiment Setup | Yes | To train the models, we maximized the approximate log-likelihood using AdaGrad (Duchi et al., 2010) with a step size η = 0.2 and δ = 10⁻⁴. For each method, we set the beam size to 20. For forced decoding, we used a bigram model with exact inference to impute z. To test RCMS, we trained it in the same way using 20 contexts per position. We used the given plain text to learn the transition probabilities, using absolute discounting (Ney et al., 1994) for smoothing. Then, we used EM to learn the emission probabilities; we used Laplace smoothing for these updates. |
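
The AdaGrad hyperparameters quoted in the experiment-setup row (η = 0.2, δ = 10⁻⁴) correspond to the standard per-coordinate update of Duchi et al. (2010). The sketch below is only an illustration of that update under those reported values; the function name, array shapes, and the random gradient are hypothetical placeholders and are not taken from the paper's CodaLab code.

```python
import numpy as np

def adagrad_step(w, grad, sq_sum, eta=0.2, delta=1e-4):
    """One AdaGrad update with the step size and delta reported in the paper.

    w      -- current parameter vector
    grad   -- gradient of the approximate log-likelihood at w (ascent direction)
    sq_sum -- running per-coordinate sum of squared gradients (same shape as w)
    """
    sq_sum += grad ** 2                           # accumulate squared gradients
    w += eta * grad / (delta + np.sqrt(sq_sum))   # adaptively scaled ascent step
    return w, sq_sum

# Illustrative usage with a random gradient (not real model data).
w = np.zeros(10)
sq_sum = np.zeros(10)
grad = np.random.randn(10)
w, sq_sum = adagrad_step(w, grad, sq_sum)
```

Because each coordinate's step shrinks with its accumulated squared gradient, frequently updated features take smaller steps over time; δ only guards against division by zero early in training.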