Differentiable Perturb-and-Parse: Semi-Supervised Parsing with a Structured Variational Autoencoder

Authors: Caio Corro, Ivan Titov

ICLR 2019

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate effectiveness of our approach with experiments on English, French and Swedish. |
| Researcher Affiliation | Academia | Caio Corro, Ivan Titov — ILCC, School of Informatics, University of Edinburgh; ILLC, University of Amsterdam. c.f.corro@uva.nl, ititov@inf.ed.ac.uk |
| Pseudocode | Yes | We report pseudo-codes for the forward and backward passes of our continuous relaxation of Eisner's algorithm in Appendix F. |
| Open Source Code | No | The paper does not provide a direct link to a source-code repository or explicitly state that the code for the described methodology is released. |
| Open Datasets | Yes | English: We use the Stanford Dependency conversion (De Marneffe & Manning, 2008) of the Penn Treebank (Marcus et al., 1993) with the usual section split: 02-21 for training, 22 for development and 23 for testing. |
| Dataset Splits | Yes | English: We use the Stanford Dependency conversion (De Marneffe & Manning, 2008) of the Penn Treebank (Marcus et al., 1993) with the usual section split: 02-21 for training, 22 for development and 23 for testing. |
| Hardware Specification | Yes | For English, the supervised parser took 1.5 hours to train on a NVIDIA Titan X GPU, while the semi-supervised parser without sentence embedding, which sees 2 times more instances per epoch, took 3.5 hours to train. |
| Software Dependencies | No | The paper mentions using "Adadelta (Zeiler, 2012) with default parameters as provided by the Dynet library (Neubig et al., 2017)", but does not provide specific version numbers for the Dynet library or other key software components to ensure reproducibility. |
| Experiment Setup | Yes | In the first two epochs, we train the network with the discriminative loss only. Then, for the next two epochs, we add the supervised ELBO term (Equation 5). Finally, after the 6th epoch, we also add the unsupervised ELBO term (Equation 3). We train our network using stochastic gradient descent for 30 epochs using Adadelta (Zeiler, 2012) with default parameters as provided by the Dynet library (Neubig et al., 2017). In the semi-supervised scenario, we alternate between labeled and unlabeled instances. The temperature of the PEAKED-SOFTMAX operator is fixed to τ = 1. |
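The experiment-setup row describes two concrete mechanisms: a staged loss schedule (discriminative loss first, then the supervised ELBO, then the unsupervised ELBO) and a PEAKED-SOFTMAX operator with temperature τ. The sketch below is an illustrative reconstruction, not the authors' code: the function names `active_losses` and `peaked_softmax` are invented here, the exact epoch at which each term activates follows the quoted schedule (the gap between epochs 4 and 7 is ambiguous in the quote, so supervised ELBO is assumed active from epoch 3 onward), and PEAKED-SOFTMAX is interpreted as the standard temperature-scaled softmax.

```python
import math

def peaked_softmax(weights, tau=1.0):
    """Temperature-scaled softmax (assumed reading of PEAKED-SOFTMAX).
    Lower tau sharpens the distribution toward a one-hot argmax."""
    m = max(w / tau for w in weights)          # subtract max for numerical stability
    exps = [math.exp(w / tau - m) for w in weights]
    z = sum(exps)
    return [e / z for e in exps]

def active_losses(epoch):
    """Staged schedule as quoted from the paper (1-indexed epochs, an assumption):
    epochs 1-2: discriminative loss only;
    epoch 3+:   add the supervised ELBO term (Equation 5);
    epoch 7+:   also add the unsupervised ELBO term (Equation 3)."""
    losses = ["discriminative"]
    if epoch >= 3:
        losses.append("supervised_elbo")
    if epoch >= 7:
        losses.append("unsupervised_elbo")
    return losses
```

With τ = 1, as fixed in the paper's experiments, `peaked_softmax` reduces to the ordinary softmax; lowering τ would push the relaxed parse decisions closer to discrete choices.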