Procedural Text Understanding via Scene-Wise Evolution

Authors: Jialong Tang, Hongyu Lin, Meng Liao, Yaojie Lu, Xianpei Han, Le Sun, Weijian Xie, Jin Xu

AAAI 2022, pp. 11367-11375 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments show that SGR not only achieves the new state-of-the-art performance but also significantly accelerates the speed of reasoning. We conduct main experiments on ProPara (Mishra et al. 2018) and auxiliary experiments on Recipes (Bosselut et al. 2018).
Researcher Affiliation | Collaboration | Jialong Tang 1,3, Hongyu Lin 1, Meng Liao 4,*, Yaojie Lu 1,3, Xianpei Han 1,2, Le Sun 1,2,*, Weijian Xie 4, Jin Xu 4; 1 Chinese Information Processing Laboratory, Beijing, China; 2 State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing, China; 3 University of Chinese Academy of Sciences, Beijing, China; 4 Data Quality Team, WeChat, Tencent Inc., China; {jialong2019,hongyu,yaojie2017,xianpei,sunle}@iscas.ac.cn; {maricoliao, vikoxie, jinxxu}@tencent.com
Pseudocode | Yes | Algorithm 1: State Reasoner. Input: SGR: the trained procedural text understanding model; ConceptNet: the external commonsense knowledge base; P = {S1, S2, ..., ST}: the procedural text; E = {e1, e2, ..., eN}: pre-specified entities; Constraints: used for postprocessing; 1: The complete graph G ← Construct(P, E) ... (a hedged Python sketch of this interface is given after the table)
Open Source Code | No | The paper links to an external GitHub repository for a baseline model ('See details at https://github.com/ytyz1307zzh/NCET-ProPara.'), but does not provide access to the source code for the SGR model described in this paper.
Open Datasets | Yes | We conduct main experiments on ProPara (Mishra et al. 2018) and auxiliary experiments on Recipes (Bosselut et al. 2018).
Dataset Splits | Yes | For ProPara, we follow the official split (Mishra et al. 2018) for train/dev/test sets. For Recipes, following the previous works (Zhang et al. 2020; Huang et al. 2021), we only use the human-labeled data in our experiments, and re-split it into 80%/10%/10% for train/dev/test sets. (an illustrative re-split snippet is given after the table)
Hardware Specification | Yes | The final model is trained on an Nvidia TITAN RTX GPU with Adam optimizer (Kingma and Ba 2015), and is selected with the highest prediction accuracy on dev set. We compare the inferencing time of SGR with NCET and IEN on an Nvidia TITAN RTX GPU in Table 4.
Software Dependencies | Yes | For the context encoder, we use the BERT base implemented by Hugging Face's transformers library (Wolf et al. 2020). Specifically, we first extract the POS tags by flair (Akbik et al. 2019). (a loading snippet for both libraries is given after the table)
Experiment Setup | Yes | Hyper-parameters are manually tuned according to the accuracy on the dev set: batch size is set to 16, hidden size is set to 128, and learning rate is set to 5e-5. The final model is trained on an Nvidia TITAN RTX GPU with Adam optimizer (Kingma and Ba 2015), and is selected with the highest prediction accuracy on the dev set. (a minimal training-loop sketch with these values is given after the table)
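
The Pseudocode row quotes only the header of Algorithm 1 (State Reasoner): its inputs and the first step of constructing a complete graph over the text and entities. Below is a minimal Python sketch of that interface, not the authors' code; construct_graph and the way the trained model and constraints are invoked are assumptions made for illustration, since the remaining steps are elided in the quote.

```python
# Hypothetical sketch of the State Reasoner driver implied by Algorithm 1.
# All names are placeholders; the authors do not release SGR's source code.
from typing import Callable, Dict, List, Sequence, Tuple


def construct_graph(paragraph: Sequence[str], entities: Sequence[str]) -> List[Tuple[str, str]]:
    """Placeholder for step 1, 'The complete graph G <- Construct(P, E)':
    connect every sentence in P with every pre-specified entity in E."""
    return [(sentence, entity) for sentence in paragraph for entity in entities]


def state_reasoner(sgr_model: Callable,           # trained SGR model
                   concept_net: Dict[str, list],  # external commonsense KB (ConceptNet)
                   paragraph: Sequence[str],      # P = {S1, ..., ST}
                   entities: Sequence[str],       # E = {e1, ..., eN}
                   constraints: Callable) -> Dict[str, list]:
    graph = construct_graph(paragraph, entities)
    # The remaining steps are elided ("...") in the quoted pseudocode; a plausible
    # reading is: predict entity states with the trained model, optionally using
    # the commonsense KB, then post-process the predictions with the constraints.
    raw_predictions = sgr_model(graph, concept_net)
    return constraints(raw_predictions)
```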
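
For the Dataset Splits row, the paper states that Recipes is re-split into 80%/10%/10% train/dev/test but does not publish the splitting code; the snippet below is only an illustration of such a re-split, and the random seed and shuffle order are assumptions.

```python
import random


def resplit_80_10_10(examples, seed=0):
    """Shuffle a list of examples and cut it 80%/10%/10% into train/dev/test.
    Illustrative only; the seed and shuffling strategy are assumptions."""
    examples = list(examples)
    random.Random(seed).shuffle(examples)
    n_train = int(0.8 * len(examples))
    n_dev = int(0.1 * len(examples))
    return (examples[:n_train],
            examples[n_train:n_train + n_dev],
            examples[n_train + n_dev:])


# Example: 100 dummy recipes -> 80 / 10 / 10.
train, dev, test = resplit_80_10_10(range(100))
print(len(train), len(dev), len(test))  # 80 10 10
```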
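
The Software Dependencies row names Hugging Face's transformers (for the BERT-base context encoder) and flair (for POS tagging). The snippet below shows a typical way to load both; the exact checkpoint identifiers ('bert-base-uncased' and 'pos') are assumptions, as the paper does not specify them.

```python
# Loading the two stated dependencies; checkpoint names are assumptions.
from transformers import BertModel, BertTokenizerFast
from flair.data import Sentence
from flair.models import SequenceTagger

# BERT-base context encoder via Hugging Face transformers (Wolf et al. 2020).
tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
encoder = BertModel.from_pretrained("bert-base-uncased")

# POS tagging via flair (Akbik et al. 2019).
pos_tagger = SequenceTagger.load("pos")
sentence = Sentence("Water evaporates from the ocean and forms clouds.")
pos_tagger.predict(sentence)
print(sentence.to_tagged_string())
```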
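
The Experiment Setup row fixes batch size 16, hidden size 128, learning rate 5e-5, the Adam optimizer, and selection by dev-set accuracy. A minimal PyTorch sketch with those values follows; the toy model, random data, and epoch count are placeholders, not the paper's architecture.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Stated hyper-parameters: batch size 16, hidden size 128, learning rate 5e-5.
BATCH_SIZE, HIDDEN_SIZE, LR = 16, 128, 5e-5

# Toy stand-in task so the loop runs end to end; the real model encodes the
# procedural text with BERT-base rather than random features.
train_data = TensorDataset(torch.randn(160, 32), torch.randint(0, 2, (160,)))
dev_x, dev_y = torch.randn(32, 32), torch.randint(0, 2, (32,))

model = nn.Sequential(nn.Linear(32, HIDDEN_SIZE), nn.ReLU(), nn.Linear(HIDDEN_SIZE, 2))
optimizer = torch.optim.Adam(model.parameters(), lr=LR)  # Adam (Kingma and Ba 2015)

best_dev_acc, best_state = 0.0, None
for epoch in range(3):  # epoch count is an assumption
    model.train()
    for x, y in DataLoader(train_data, batch_size=BATCH_SIZE, shuffle=True):
        optimizer.zero_grad()
        nn.functional.cross_entropy(model(x), y).backward()
        optimizer.step()

    model.eval()
    with torch.no_grad():
        dev_acc = (model(dev_x).argmax(dim=-1) == dev_y).float().mean().item()
    if dev_acc > best_dev_acc:  # keep the checkpoint with the highest dev accuracy
        best_dev_acc = dev_acc
        best_state = {k: v.clone() for k, v in model.state_dict().items()}
```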