Procedural Text Understanding via Scene-Wise Evolution

Authors: Jialong Tang, Hongyu Lin, Meng Liao, Yaojie Lu, Xianpei Han, Le Sun, Weijian Xie, Jin Xu

AAAI 2022, pp. 11367-11375 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments show that SGR not only achieves the new state-of-the-art performance but also significantly accelerates the speed of reasoning. We conduct main experiments on ProPara (Mishra et al. 2018) and auxiliary experiments on Recipes (Bosselut et al. 2018).
Researcher Affiliation | Collaboration | Jialong Tang 1,3, Hongyu Lin 1, Meng Liao 4,*, Yaojie Lu 1,3, Xianpei Han 1,2, Le Sun 1,2,*, Weijian Xie 4, Jin Xu 4; 1 Chinese Information Processing Laboratory, Beijing, China; 2 State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing, China; 3 University of Chinese Academy of Sciences, Beijing, China; 4 Data Quality Team, WeChat, Tencent Inc., China; {jialong2019,hongyu,yaojie2017,xianpei,sunle}@iscas.ac.cn; {maricoliao, vikoxie, jinxxu}@tencent.com
Pseudocode | Yes | Algorithm 1: State Reasoner. Input: SGR: the trained procedural text understanding model; ConceptNet: the external commonsense knowledge base; P = {S1, S2, ..., ST}: the procedural text; E = {e1, e2, ..., eN}: pre-specified entities; Constraints: used for postprocessing; 1: The complete graph G ← Construct(P, E) ... (a hedged Python sketch of this interface is given after the table)
Open Source Code | No | The paper links to an external GitHub repository for a baseline model ('See details at https://github.com/ytyz1307zzh/NCET-ProPara.'), but does not provide access to the source code for the SGR model described in this paper.
Open Datasets | Yes | We conduct main experiments on ProPara (Mishra et al. 2018) and auxiliary experiments on Recipes (Bosselut et al. 2018).
Dataset Splits | Yes | For ProPara, we follow the official split (Mishra et al. 2018) for train/dev/test sets. For Recipes, following the previous works (Zhang et al. 2020; Huang et al. 2021), we only use the human-labeled data in our experiments, and re-split it into 80%/10%/10% for train/dev/test sets. (an illustrative re-split snippet is given after the table)
Hardware Specification | Yes | The final model is trained on an Nvidia TITAN RTX GPU with Adam optimizer (Kingma and Ba 2015), and is selected with the highest prediction accuracy on dev set. We compare the inferencing time of SGR with NCET and IEN on an Nvidia TITAN RTX GPU in Table 4.
Software Dependencies | Yes | For the context encoder, we use the BERT base implemented by Hugging Face's transformers library (Wolf et al. 2020). Specifically, we first extract the POS tags by flair (Akbik et al. 2019). (a loading snippet for both libraries is given after the table)
Experiment Setup | Yes | Hyper-parameters are manually tuned according to the accuracy on the dev set: batch size is set to 16, hidden size is set to 128, and learning rate is set to 5e-5. The final model is trained on an Nvidia TITAN RTX GPU with Adam optimizer (Kingma and Ba 2015), and is selected with the highest prediction accuracy on the dev set. (a minimal training-loop sketch with these values is given after the table)
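
The Pseudocode row quotes only the header of Algorithm 1 (State Reasoner): its inputs and the first step of constructing a complete graph over the text and entities. Below is a minimal Python sketch of that interface, not the authors' code; construct_graph and the way the trained model and constraints are invoked are assumptions made for illustration, since the remaining steps are elided in the quote.

```python
# Hypothetical sketch of the State Reasoner driver implied by Algorithm 1.
# All names are placeholders; the authors do not release SGR's source code.
from typing import Callable, Dict, List, Sequence, Tuple


def construct_graph(paragraph: Sequence[str], entities: Sequence[str]) -> List[Tuple[str, str]]:
    """Placeholder for step 1, 'The complete graph G <- Construct(P, E)':
    connect every sentence in P with every pre-specified entity in E."""
    return [(sentence, entity) for sentence in paragraph for entity in entities]


def state_reasoner(sgr_model: Callable,           # trained SGR model
                   concept_net: Dict[str, list],  # external commonsense KB (ConceptNet)
                   paragraph: Sequence[str],      # P = {S1, ..., ST}
                   entities: Sequence[str],       # E = {e1, ..., eN}
                   constraints: Callable) -> Dict[str, list]:
    graph = construct_graph(paragraph, entities)
    # The remaining steps are elided ("...") in the quoted pseudocode; a plausible
    # reading is: predict entity states with the trained model, optionally using
    # the commonsense KB, then post-process the predictions with the constraints.
    raw_predictions = sgr_model(graph, concept_net)
    return constraints(raw_predictions)
```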
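
For the Dataset Splits row, the paper states that Recipes is re-split into 80%/10%/10% train/dev/test but does not publish the splitting code; the snippet below is only an illustration of such a re-split, and the random seed and shuffle order are assumptions.

```python
import random


def resplit_80_10_10(examples, seed=0):
    """Shuffle a list of examples and cut it 80%/10%/10% into train/dev/test.
    Illustrative only; the seed and shuffling strategy are assumptions."""
    examples = list(examples)
    random.Random(seed).shuffle(examples)
    n_train = int(0.8 * len(examples))
    n_dev = int(0.1 * len(examples))
    return (examples[:n_train],
            examples[n_train:n_train + n_dev],
            examples[n_train + n_dev:])


# Example: 100 dummy recipes -> 80 / 10 / 10.
train, dev, test = resplit_80_10_10(range(100))
print(len(train), len(dev), len(test))  # 80 10 10
```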
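
The Software Dependencies row names Hugging Face's transformers (for the BERT-base context encoder) and flair (for POS tagging). The snippet below shows a typical way to load both; the exact checkpoint identifiers ('bert-base-uncased' and 'pos') are assumptions, as the paper does not specify them.

```python
# Loading the two stated dependencies; checkpoint names are assumptions.
from transformers import BertModel, BertTokenizerFast
from flair.data import Sentence
from flair.models import SequenceTagger

# BERT-base context encoder via Hugging Face transformers (Wolf et al. 2020).
tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
encoder = BertModel.from_pretrained("bert-base-uncased")

# POS tagging via flair (Akbik et al. 2019).
pos_tagger = SequenceTagger.load("pos")
sentence = Sentence("Water evaporates from the ocean and forms clouds.")
pos_tagger.predict(sentence)
print(sentence.to_tagged_string())
```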
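
The Experiment Setup row fixes batch size 16, hidden size 128, learning rate 5e-5, the Adam optimizer, and selection by dev-set accuracy. A minimal PyTorch sketch with those values follows; the toy model, random data, and epoch count are placeholders, not the paper's architecture.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Stated hyper-parameters: batch size 16, hidden size 128, learning rate 5e-5.
BATCH_SIZE, HIDDEN_SIZE, LR = 16, 128, 5e-5

# Toy stand-in task so the loop runs end to end; the real model encodes the
# procedural text with BERT-base rather than random features.
train_data = TensorDataset(torch.randn(160, 32), torch.randint(0, 2, (160,)))
dev_x, dev_y = torch.randn(32, 32), torch.randint(0, 2, (32,))

model = nn.Sequential(nn.Linear(32, HIDDEN_SIZE), nn.ReLU(), nn.Linear(HIDDEN_SIZE, 2))
optimizer = torch.optim.Adam(model.parameters(), lr=LR)  # Adam (Kingma and Ba 2015)

best_dev_acc, best_state = 0.0, None
for epoch in range(3):  # epoch count is an assumption
    model.train()
    for x, y in DataLoader(train_data, batch_size=BATCH_SIZE, shuffle=True):
        optimizer.zero_grad()
        nn.functional.cross_entropy(model(x), y).backward()
        optimizer.step()

    model.eval()
    with torch.no_grad():
        dev_acc = (model(dev_x).argmax(dim=-1) == dev_y).float().mean().item()
    if dev_acc > best_dev_acc:  # keep the checkpoint with the highest dev accuracy
        best_dev_acc = dev_acc
        best_state = {k: v.clone() for k, v in model.state_dict().items()}
```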