Probing Product Description Generation via Posterior Distillation

Authors: Haolan Zhan, Hainan Zhang, Hongshen Chen, Lei Shen, Zhuoye Ding, Yongjun Bao, Weipeng Yan, Yanyan Lan
Pages: 14301-14309

AAAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results show that our model is superior to traditional generative models in both automatic indicators and human evaluation.
Researcher Affiliation | Collaboration | 1 Institute of Software, Chinese Academy of Sciences, Beijing, China; 2 Data Science Lab, JD.com, Beijing, China; 3 Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China; 4 University of Chinese Academy of Sciences, Beijing, China
Pseudocode | No | The paper describes the model architecture and components textually and with a diagram, but it does not include pseudocode or a clearly labeled algorithm block.
Open Source Code | No | The paper mentions 'https://github.com/OpenNMT/OpenNMT-py', a third-party framework used for implementation, and 'https://github.com/jddsl/JDPDG', which hosts the dataset. There is no explicit statement or link providing the source code for the proposed model.
Open Datasets | Yes | We collect a large-scale Chinese product description generation dataset, named JDPDG, from JD.com, one of the biggest e-commerce platforms in China. This dataset contains 345,799 pairs of item content and description. https://github.com/jddsl/JDPDG
Dataset Splits | Yes | Table 2 (data statistics for the proposed JDPDG dataset): Shoes&Clothes: 135,941 training / 4,000 validation / 4,000 test pairs; Digital: 100,236 / 4,000 / 4,000; Homing: 85,622 / 4,000 / 4,000.
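As a quick consistency check, the per-category split sizes in Table 2 sum to the 345,799-pair total quoted in the Open Datasets row above. A minimal Python sketch (the dictionary layout is our own; category and split names are taken from the table):

```python
# Sanity check: Table 2 split sizes should sum to the 345,799 item-description
# pairs reported for the JDPDG dataset.
splits = {
    "Shoes&Clothes": {"train": 135_941, "valid": 4_000, "test": 4_000},
    "Digital":       {"train": 100_236, "valid": 4_000, "test": 4_000},
    "Homing":        {"train": 85_622,  "valid": 4_000, "test": 4_000},
}

total = sum(n for category in splits.values() for n in category.values())
print(total)            # 345799
assert total == 345_799  # matches the reported dataset size
```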
Hardware Specification | Yes | We implement our model in OpenNMT and train all models on Tesla P40 GPUs with PyTorch (Paszke et al. 2019).
Software Dependencies | No | The paper mentions 'OpenNMT' and 'PyTorch' as software used for implementation, but it does not specify version numbers for either.
Experiment Setup | Yes | For experimental models, the hidden units of all transformer-based models are set to 512 and the feed-forward hidden size is set to 1,024. The beam search size is set to 5 and the length penalty to α = 0.4 (Wu et al. 2016). For LSTM-based models, the word dimension is set to 300 and the hidden size to 256 for both the encoder and decoder. The dropout rate and smoothing factor are set to 0.1 (Fabbri et al. 2019). The initial learning rate is set to 0.001. β1 = 0.9 and β2 = 0.998 are used for gradient optimization. We also apply a warm-up trick over the first 8,000 steps, with decay as in Vaswani et al. (2017). For hyper-parameters, we set γ1, β and α to 0.5, 0.4 and 0.5, respectively.
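For reference, the reported settings can be collected into a single configuration, together with a warm-up/decay schedule of the shape described in Vaswani et al. (2017). This is a minimal sketch: the optimizer choice (an Adam-style optimizer, inferred from the β1/β2 values) and the exact coupling between the schedule and the 0.001 initial learning rate are assumptions, since the paper does not spell them out.

```python
# Reported hyper-parameters from the Experiment Setup row above.
config = {
    "transformer_hidden_size": 512,
    "transformer_ffn_size": 1024,
    "beam_size": 5,
    "length_penalty_alpha": 0.4,   # Wu et al. 2016
    "lstm_word_dim": 300,
    "lstm_hidden_size": 256,
    "dropout": 0.1,
    "label_smoothing": 0.1,
    "learning_rate": 0.001,
    "adam_beta1": 0.9,             # Adam-style optimizer assumed from the beta values
    "adam_beta2": 0.998,
    "warmup_steps": 8000,
    "gamma1": 0.5,                 # model-specific hyper-parameters gamma1, beta, alpha
    "beta": 0.4,
    "alpha": 0.5,
}

def warmup_then_decay_lr(step: int, d_model: int = 512,
                         warmup: int = 8000, base_lr: float = 1.0) -> float:
    """Warm-up followed by inverse-square-root decay, as in Vaswani et al. (2017).

    How this schedule is scaled against the reported 0.001 initial learning
    rate is an assumption; OpenNMT-py applies a schedule of this shape when
    its 'noam' decay method is selected.
    """
    step = max(step, 1)
    return base_lr * d_model ** -0.5 * min(step ** -0.5, step * warmup ** -1.5)
```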