Jointly Learning to Label Sentences and Tokens
Authors: Marek Rei, Anders Søgaard
AAAI 2019, pp. 6916-6923
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments show that by learning to perform these tasks jointly on multiple levels, the model achieves substantial improvements for both sentence classification and sequence labeling. |
| Researcher Affiliation | Academia | Marek Rei, The ALTA Institute, Computer Laboratory, University of Cambridge, United Kingdom (marek.rei@cl.cam.ac.uk); Anders Søgaard, CoAStaL, DIKU, Department of Computer Science, University of Copenhagen, Denmark (soegaard@di.ku.dk) |
| Pseudocode | No | The paper describes the model using mathematical equations and textual descriptions but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code for running these experiments will be made publicly available: http://www.marekrei.com/projects/mltagger |
| Open Datasets | Yes | We evaluate the joint labeling framework on three different tasks and datasets. The CoNLL 2010 shared task (Farkas et al. 2010) dataset [...] For error detection on both levels, we use the First Certificate in English (FCE, Yannakoudakis, Briscoe, and Medlock (2011)) dataset [...] Finally, we convert the Stanford Sentiment Treebank (SST, Socher, Perelygin, and Wu (2013)) [...] |
| Dataset Splits | No | The paper mentions using a 'development set' for early stopping and reports 'DEV F1' in its results tables, but it does not give the exact percentages or sample counts of the training, validation, and test splits, nor does it state how the splits were derived (e.g., standard predefined splits or custom proportions). |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU model, CPU type, memory) used for running its experiments. |
| Software Dependencies | No | The paper mentions an optimizer (AdaDelta) and pre-trained embeddings (GloVe) but does not provide specific version numbers for any software libraries or frameworks used in the implementation (e.g., Python, PyTorch, TensorFlow). |
| Experiment Setup | Yes | We combine all the different objective functions together using weighting parameters. [...] When using the full system, we use Λ_sent = Λ_tok = 1, Λ_LM = Λ_char = 0.1 and Λ_attn = 0.01. [...] Word embeddings were set to size 300, [...] The word-level LSTMs are size 300 and character-level LSTMs size 100; the hidden combined representation h_i was set to size 200; the attention weight layer e_i was set to size 100. The model was optimized using AdaDelta (Zeiler 2012) with learning rate 1.0. [...] Dropout (Srivastava et al. 2014) with probability 0.5 was applied [...] Training was stopped if performance on the development set had not improved for 7 epochs. |
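For readers who want to reproduce this configuration, below is a minimal sketch of how the quoted hyperparameters could be wired together, assuming PyTorch (the paper does not state its framework). The names `JointTagger` and `combined_loss`, the character-embedding size, and the vocabulary sizes are hypothetical; only the numeric settings (layer sizes, loss weights, learning rate, dropout, patience) come from the quoted setup, and the forward pass that composes the layers is intentionally omitted.

```python
# Sketch of the reported setup; not the authors' implementation.
import torch
import torch.nn as nn

# Loss-weighting parameters for the full system, as quoted above.
L_SENT, L_TOK, L_LM, L_CHAR, L_ATTN = 1.0, 1.0, 0.1, 0.1, 0.01


class JointTagger(nn.Module):
    """Skeleton holding the quoted layer sizes; how the layers are composed
    (and the attention over tokens) is not shown here."""

    def __init__(self, vocab_size, char_vocab_size, num_labels, char_emb_size=50):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, 300)                     # word embeddings, size 300
        self.char_emb = nn.Embedding(char_vocab_size, char_emb_size)      # size 50 is an assumption
        self.char_lstm = nn.LSTM(char_emb_size, 100, bidirectional=True)  # character-level LSTM, size 100
        self.word_lstm = nn.LSTM(300, 300, bidirectional=True)            # word-level LSTM, size 300
        self.hidden = nn.Linear(2 * 300, 200)                             # combined representation h_i, size 200
        self.attention = nn.Linear(200, 100)                              # attention weight layer e_i, size 100
        self.dropout = nn.Dropout(p=0.5)                                  # dropout probability 0.5
        self.token_out = nn.Linear(200, num_labels)                       # token-level predictions
        self.sent_out = nn.Linear(200, num_labels)                        # sentence-level prediction


def combined_loss(sent, tok, lm, char, attn):
    """Weighted sum of the individual objective functions."""
    return L_SENT * sent + L_TOK * tok + L_LM * lm + L_CHAR * char + L_ATTN * attn


model = JointTagger(vocab_size=20000, char_vocab_size=100, num_labels=2)
optimizer = torch.optim.Adadelta(model.parameters(), lr=1.0)  # AdaDelta, learning rate 1.0

# Early stopping: halt training once development-set performance
# has not improved for 7 consecutive epochs.
PATIENCE = 7
```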