Masking Orchestration: Multi-Task Pretraining for Multi-Role Dialogue Representation Learning

Authors: Tianyi Wang, Yating Zhang, Xiaozhong Liu, Changlong Sun, Qiong Zhang (pp. 9217-9224)

Venue: AAAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The proposed fine-tuned pretraining mechanism is comprehensively evaluated on three different dialogue datasets along with a number of downstream dialogue-mining tasks. Results show that the proposed pretraining mechanism contributes significantly to all downstream tasks, regardless of the encoder used.
Researcher Affiliation | Collaboration | (1) Alibaba Group, Hangzhou, Zhejiang, China; (2) Indiana University Bloomington, Bloomington, Indiana, USA. Emails: {will.wty, ranran.zyt, qz.zhang}@alibaba-inc.com; liu237@indiana.edu; changlong.scl@taobao.com
Pseudocode | No | The paper describes the model architecture and pretraining tasks in prose, but does not include formal pseudocode or algorithm blocks.
Open Source Code | No | To motivate other scholars to investigate this novel but important problem, we make the experiment dataset publicly available. https://github.com/wangtianyiftd/dialogue_pretrain (The provided link is for the dataset, not explicitly for the source code of the methodology.)
Open Datasets | Yes | To motivate other scholars to investigate this novel but important problem, we make the experiment dataset publicly available. https://github.com/wangtianyiftd/dialogue_pretrain. The CSD corpus (footnote 5, https://sites.google.com/view/nlp-ssa) is collected from the customer service center of a top e-commerce platform and contains over 5 million customer service records between two roles (customer and agent) across two product categories, namely Clothes and Makeup. The EMD corpus is a combined dataset (footnote 6) consisting of four open English meeting corpora: the AMI Corpus (Goo and Chen 2018), the Switchboard Corpus (Jurafsky 2000), the MRDA Corpus (Shriberg et al. 2004), and the bAbI-Tasks Corpus (footnote 7).
Dataset Splits | No | The paper mentions 'training data' but does not provide specific details on validation splits, percentages, or sample counts (e.g., an '80/10/10 split' or '40,000 training samples').
Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments (e.g., GPU models, CPU types, or cloud computing specifications).
Software Dependencies | No | The paper mentions using 'Adam Optimization' and an 'LSTM cell', but does not specify exact version numbers for any software dependencies, programming languages, or libraries used in the implementation.
Experiment Setup | Yes | In our experiments, we optimize the tested models using Adam optimization (Kingma and Ba 2014) with a learning rate of 5e-4. The dimensions of the word embedding and role embedding are 300 and 100, respectively. The hidden layer sizes are all set to 256. We use a 2-layer Transformer block, where the feed-forward filter size is 1024 and the number of heads equals 4.
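
To make the reported hyperparameters concrete, below is a minimal sketch (not the authors' released code) of an encoder configured as described: 300-dimensional word embeddings, 100-dimensional role embeddings, 256-dimensional hidden layers, a 2-layer Transformer block with feed-forward size 1024 and 4 attention heads, optimized with Adam at a learning rate of 5e-4. The concatenation-plus-projection of word and role embeddings, the use of PyTorch, and the names DialogueEncoderSketch, vocab_size, and num_roles are illustrative assumptions, not details taken from the paper.

```python
# Hedged sketch of the reported hyperparameter configuration; not the authors' code.
import torch
import torch.nn as nn


class DialogueEncoderSketch(nn.Module):
    def __init__(self, vocab_size, num_roles):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, 300)   # word embedding dim = 300
        self.role_emb = nn.Embedding(num_roles, 100)    # role embedding dim = 100
        # Assumption: concatenated embeddings are projected to the 256-d hidden size.
        self.proj = nn.Linear(300 + 100, 256)
        layer = nn.TransformerEncoderLayer(
            d_model=256, nhead=4, dim_feedforward=1024, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)  # 2 Transformer blocks

    def forward(self, token_ids, role_ids):
        # token_ids, role_ids: (batch, seq_len) integer tensors
        x = torch.cat([self.word_emb(token_ids), self.role_emb(role_ids)], dim=-1)
        return self.encoder(self.proj(x))


model = DialogueEncoderSketch(vocab_size=30000, num_roles=2)  # sizes are placeholders
optimizer = torch.optim.Adam(model.parameters(), lr=5e-4)     # Adam, lr = 5e-4 as reported
```

The sketch only fixes the dimensions and optimizer settings quoted above; the actual masking-orchestration pretraining objectives and input pipeline are described in the paper in prose and are not reproduced here.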