Differentiable Perturb-and-Parse: Semi-Supervised Parsing with a Structured Variational Autoencoder
Authors: Caio Corro, Ivan Titov
ICLR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the effectiveness of our approach with experiments on English, French and Swedish. |
| Researcher Affiliation | Academia | Caio Corro, ILLC, University of Amsterdam (c.f.corro@uva.nl); Ivan Titov, ILCC, School of Informatics, University of Edinburgh and ILLC, University of Amsterdam (ititov@inf.ed.ac.uk) |
| Pseudocode | Yes | We report pseudo-codes for the forward and backward passes of our continuous relaxation of EISNER's algorithm in Appendix F. |
| Open Source Code | No | The paper does not provide a direct link to a source-code repository or explicitly state that the code for the described methodology is released. |
| Open Datasets | Yes | English We use the Stanford Dependency conversion (De Marneffe & Manning, 2008) of the Penn Treebank (Marcus et al., 1993) with the usual section split: 02-21 for training, 22 for development and 23 for testing. |
| Dataset Splits | Yes | English We use the Stanford Dependency conversion (De Marneffe & Manning, 2008) of the Penn Treebank (Marcus et al., 1993) with the usual section split: 02-21 for training, 22 for development and 23 for testing. |
| Hardware Specification | Yes | For English, the supervised parser took 1.5 hours to train on a NVIDIA Titan X GPU while the semi-supervised parser without sentence embedding, which sees 2 times more instances per epoch, took 3.5 hours to train. |
| Software Dependencies | No | The paper mentions using 'Adadelta (Zeiler, 2012) with default parameters as provided by the Dynet library (Neubig et al., 2017)', but does not provide specific version numbers for the Dynet library or other key software components to ensure reproducibility. |
| Experiment Setup | Yes | In the first two epochs, we train the network with the discriminative loss only. Then, for the next two epochs, we add the supervised ELBO term (Equation 5). Finally, after the 6th epoch, we also add the unsupervised ELBO term (Equation 3). We train our network using stochastic gradient descent for 30 epochs using Adadelta (Zeiler, 2012) with default parameters as provided by the Dynet library (Neubig et al., 2017). In the semi-supervised scenario, we alternate between labeled and unlabeled instances. The temperature of the PEAKED-SOFTMAX operator is fixed to τ = 1. (A schedule sketch follows the table.) |
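
The staged curriculum quoted in the Experiment Setup row is concrete enough to sketch: discriminative loss only for epochs 1–2, the supervised ELBO added from epoch 3, and the unsupervised ELBO added after epoch 6, for 30 epochs total with labeled and unlabeled batches alternating. The following is a minimal, hypothetical Python outline of that schedule plus a temperature-controlled softmax at the paper's τ = 1; the helper names and the printed trace are illustrative placeholders, not the authors' DyNet code.

```python
import numpy as np

def peaked_softmax(scores, tau=1.0):
    """Temperature-controlled softmax; tau = 1 matches the paper's setting.
    Lower tau makes the distribution more peaked (closer to argmax)."""
    z = scores / tau
    z = z - z.max()  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def loss_terms(epoch):
    """Loss terms active at a given (1-indexed) epoch, following the
    staged schedule quoted above (hypothetical helper, for illustration)."""
    terms = ["discriminative"]
    if epoch >= 3:
        terms.append("supervised_elbo")    # Equation 5 in the paper
    if epoch >= 7:
        terms.append("unsupervised_elbo")  # Equation 3 in the paper
    return terms

for epoch in range(1, 31):  # 30 epochs total, as in the quoted setup
    active = loss_terms(epoch)
    # In the semi-supervised runs, labeled and unlabeled instances alternate;
    # here we only print the schedule at its transition points.
    if epoch in (1, 3, 7, 30):
        print(f"epoch {epoch:2d}: {' + '.join(active)}")

print(peaked_softmax(np.array([2.0, 1.0, 0.1]), tau=1.0))
```

The optimizer itself is not sketched here: the paper states only that it uses Adadelta with the default parameters provided by the Dynet library, without pinning library versions.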