Effective Slot Filling via Weakly-Supervised Dual-Model Learning
Authors: Jue Wang, Ke Chen, Lidan Shou, Sai Wu, Gang Chen (pp. 13952–13960)
AAAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experimental results demonstrate that our approach achieves better results than standard baselines on multiple datasets, especially in the low-resource setting. We evaluate the performance of our method on three different datasets, namely SNIPS (Coucke et al. 2018), ATIS (Hemphill, Godfrey, and Doddington 1990; Tur, Hakkani-Tür, and Heck 2010) and MIT Rest. (Liu et al. 2013). |
| Researcher Affiliation | Academia | 1 College of Computer Science and Technology, Zhejiang University; 2 State Key Laboratory of CAD&CG, Zhejiang University. {zjuwangjue,chenk,should,wusai,cg}@zju.edu.cn |
| Pseudocode | No | The paper describes the model architecture and training process in text and diagrams (Figure 1 and Figure 2) but does not include any explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at https://github.com/LorrinWWW/weakly-supervised-slot-filling |
| Open Datasets | Yes | We evaluate the performance of our method on three different datasets, namely SNIPS (Coucke et al. 2018), ATIS (Hemphill, Godfrey, and Doddington 1990; Tur, Hakkani-Tür, and Heck 2010) and MIT Rest. (Liu et al. 2013). |
| Dataset Splits | Yes | We use the standard train-dev-test split for these datasets. For ATIS and MIT Rest., since they do not have a standard development set, we randomly pick 10% of the original training set as the development set. And for each run, we save the model checkpoint that achieves the highest F1 score on the dev set, and report its score on the test set. |
| Hardware Specification | No | The paper does not provide specific hardware details (like GPU/CPU models, processors, or memory) used for running its experiments. |
| Software Dependencies | No | The paper mentions software components like 'GloVe word vectors', 'BERT (bert-large-uncased)', and 'Adam' but does not specify their version numbers or any other software dependencies with versions. |
| Experiment Setup | Yes | For each mini-batch, we sample 30 utterances from labeled data and from weakly-labeled data. GloVe word vectors (Pennington, Socher, and Manning 2014) are used to initialize word embeddings, which are tuned during training. We also use BERT (bert-large-uncased, fixed without fine-tuning) to produce contextualized embeddings concatenated after the word embeddings. We set the hidden size to 200, and since we use bidirectional LSTMs, the hidden size for each LSTM is 100. We also apply 0.3 dropout after embeddings and LSTMs to mitigate the over-fitting issue. We use Adam with a learning rate of 1e-3 to train the model. |
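
As a concrete illustration of the split procedure quoted in the Dataset Splits row, the following is a minimal sketch, assuming the training data is available as a list of labeled utterances. The 10% dev fraction comes from the quoted text; the function name, the fixed seed, and the shuffling strategy are illustrative assumptions, not the authors' code.

```python
import random

def make_dev_split(train_examples, dev_fraction=0.1, seed=42):
    """Randomly hold out a fraction of the training set as a development set.

    Mirrors the paper's description for ATIS and MIT Rest., which lack a
    standard dev set. The seed value here is an illustrative assumption.
    """
    examples = list(train_examples)
    random.Random(seed).shuffle(examples)
    n_dev = int(len(examples) * dev_fraction)
    return examples[n_dev:], examples[:n_dev]  # (train, dev)
```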
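
The Experiment Setup row pins down most encoder hyperparameters: GloVe-initialized word embeddings that are fine-tuned, fixed bert-large-uncased features concatenated to them, bidirectional LSTMs with 100 hidden units per direction (200 total), 0.3 dropout after embeddings and LSTMs, and Adam at a learning rate of 1e-3. Below is a minimal PyTorch sketch of that configuration; the class name, vocabulary handling, and the linear tagging head are assumptions for illustration, not the released implementation.

```python
import torch
import torch.nn as nn

class WeaklySupervisedSlotEncoder(nn.Module):
    """Sketch of the encoder configuration described in the experiment setup.

    Word embeddings are initialized from GloVe and tuned during training;
    fixed BERT (bert-large-uncased, hidden size 1024) embeddings are
    concatenated to them before a bidirectional LSTM with 100 hidden units
    per direction. The slot-tagging head is an illustrative assumption.
    """

    def __init__(self, glove_weights, num_labels, bert_dim=1024, dropout=0.3):
        super().__init__()
        self.word_emb = nn.Embedding.from_pretrained(glove_weights, freeze=False)
        word_dim = glove_weights.size(1)
        self.emb_dropout = nn.Dropout(dropout)
        # Bidirectional LSTM: 100 per direction -> 200 total hidden size.
        self.lstm = nn.LSTM(word_dim + bert_dim, 100,
                            batch_first=True, bidirectional=True)
        self.lstm_dropout = nn.Dropout(dropout)
        self.classifier = nn.Linear(200, num_labels)

    def forward(self, word_ids, bert_features):
        # bert_features are produced by a frozen bert-large-uncased encoder.
        x = torch.cat([self.word_emb(word_ids), bert_features], dim=-1)
        x = self.emb_dropout(x)
        h, _ = self.lstm(x)
        return self.classifier(self.lstm_dropout(h))

# Training uses Adam with a learning rate of 1e-3, as stated in the quoted setup:
# model = WeaklySupervisedSlotEncoder(glove_weights, num_labels)
# optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
```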