What Happens Next? Event Prediction Using a Compositional Neural Network Model
Authors: Mark Granroth-Wilding, Stephen Clark
Venue: AAAI 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate a range of systems that induce vector-space representations of events and use them to make predictions, comparing the results to the positive pointwise mutual information (PPMI) measure of Chambers and Jurafsky (2008, henceforth C&J08). ... The test set prediction accuracy of each of the models is shown in table 1. |
| Researcher Affiliation | Academia | Mark Granroth-Wilding and Stephen Clark {mark.granroth-wilding, stephen.clark}@cl.cam.ac.uk Computer Laboratory, University of Cambridge, UK |
| Pseudocode | No | The paper describes its models and methods in text and diagrams (Figure 4) but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Implementations of all the models and the evaluation, as well as the evaluation dataset split, are available at http://mark.granroth-wilding.co.uk/papers/what_happens_next/. |
| Open Datasets | Yes | Following Chambers and Jurafsky (2008; 2009), we extract events from the NYT portion of the Gigaword corpus (Graff et al. 2003). ... Graff, D.; Kong, J.; Chen, K.; and Maeda, K. 2003. English Gigaword, LDC2003T05. Linguistic Data Consortium, Philadelphia. |
| Dataset Splits | Yes | We randomly select 10% of the documents in the corpus to use as a test set and 10% to use as a development set, the latter being used to compare architectures and optimize hyperparameters prior to evaluation. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU, CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions software tools like 'C&C tools', 'Open NLP', 'word2vec', and 'Gensim implementation', but does not provide version numbers for any of them, which would be needed to reproduce the software environment exactly. |
| Experiment Setup | Yes | We train a skipgram model with hierarchical sampling, using a window size of 5 and vector size of 300. ... The input vector for each word is 300-dimensional. We use two hidden layers in the argument composition, with sizes 600 and 300, and two in the event composition, with sizes 400 and 200. Autoencoders were all trained with 30% dropout corruption for 2 iterations over the full training set, with a learning rate of 0.1 and λ = 0.001. Both subsequent training stages used a learning rate of 0.01 and λ = 0.018. The first (event composition only) was run for 3 iterations, the second (full network) for 8. All stages of training used SGD with 1,000-sized minibatches. |
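
The Experiment Setup row fixes the layer sizes and training hyperparameters but not the exact wiring of the network. The PyTorch sketch below arranges the quoted sizes into a plausible event-composition model: the 300-dimensional word vectors, the 600 and 300 argument-composition layers, the 400 and 200 event-composition layers, SGD with 1,000-sized minibatches, and the 0.01 learning rate are values quoted above, while the concatenation scheme, the tanh/sigmoid activations, the single-score output layer, and the reading of λ as L2 weight decay are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

# Sizes quoted in the Experiment Setup row.
WORD_DIM = 300               # word2vec skip-gram vectors (window size 5)
ARG_HIDDEN = (600, 300)      # two hidden layers for argument composition
EVENT_HIDDEN = (400, 200)    # two hidden layers for event composition


class EventCompositionSketch(nn.Module):
    """Illustrative sketch of a compositional event-pair scorer.

    Layer sizes follow the quoted setup; the wiring (what is concatenated,
    which activations are used, how the coherence score is produced) is an
    assumption made only for this example.
    """

    def __init__(self):
        super().__init__()
        # Assumption: a predicate vector and three argument vectors are
        # concatenated before argument composition (4 * 300 = 1200 inputs).
        self.arg_comp = nn.Sequential(
            nn.Linear(4 * WORD_DIM, ARG_HIDDEN[0]), nn.Tanh(),
            nn.Linear(ARG_HIDDEN[0], ARG_HIDDEN[1]), nn.Tanh(),
        )
        # Assumption: two event representations are concatenated and
        # composed into a single coherence score in [0, 1].
        self.event_comp = nn.Sequential(
            nn.Linear(2 * ARG_HIDDEN[1], EVENT_HIDDEN[0]), nn.Tanh(),
            nn.Linear(EVENT_HIDDEN[0], EVENT_HIDDEN[1]), nn.Tanh(),
            nn.Linear(EVENT_HIDDEN[1], 1), nn.Sigmoid(),
        )

    def forward(self, event_a: torch.Tensor, event_b: torch.Tensor) -> torch.Tensor:
        """event_a, event_b: (batch, 4 * WORD_DIM) concatenated word vectors."""
        repr_a = self.arg_comp(event_a)
        repr_b = self.arg_comp(event_b)
        return self.event_comp(torch.cat([repr_a, repr_b], dim=-1))


# Training hyperparameters quoted above: SGD with 1,000-sized minibatches and
# a learning rate of 0.01 for the later stages; lambda = 0.018 is assumed here
# to be an L2 weight-decay coefficient.
model = EventCompositionSketch()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, weight_decay=0.018)
```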
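
The Dataset Splits row describes a simple random selection of 10% of documents for test and 10% for development, with the remainder used for training. A minimal Python sketch of such a split is given below; the seed, the shuffling mechanism, and the function name `split_documents` are illustrative assumptions rather than details taken from the paper.

```python
import random


def split_documents(doc_ids, seed=0):
    """Randomly split document ids into 80% train, 10% dev, 10% test.

    Only the 10%/10% proportions come from the paper; the seed and the
    shuffle-then-slice mechanism are assumptions for illustration.
    """
    docs = list(doc_ids)
    random.Random(seed).shuffle(docs)
    n_tenth = len(docs) // 10
    test = docs[:n_tenth]
    dev = docs[n_tenth:2 * n_tenth]
    train = docs[2 * n_tenth:]
    return train, dev, test
```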