Relational Gating for "What If" Reasoning
Authors: Chen Zheng, Parisa Kordjamshidi
IJCAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show that modeling pairwise relationships helps to capture higher-order relations and find the line of reasoning for causes and effects in the procedural descriptions. Our proposed approach achieves the state-of-the-art results on the WIQA dataset. |
| Researcher Affiliation | Academia | Chen Zheng and Parisa Kordjamshidi, Michigan State University, {zhengc12, kordjams}@msu.edu |
| Pseudocode | No | The paper does not contain any explicitly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | Yes | Our code is available at https://github.com/HLR/RGN. |
| Open Datasets | Yes | WIQA dataset is available at http://data.allenai.org/wiqa/. |
| Dataset Splits | Yes | Questions: Train 29808, Dev 6894, Test V1 3993, Test V2 3003, Total 43698 |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | We implemented RGN using PyTorch. We used RoBERTa Base in our model. The paper mentions PyTorch and RoBERTa, but does not provide specific version numbers for these software components. |
| Experiment Setup | Yes | For each data sample, we keep 128 tokens as the max length for the question, and 256 tokens as the max length for paragraph contents. Notice that both gated entity representations for question and paragraph use k = 10 for selecting the top-k entities in our experiments. The value of this hyper-parameter was selected after experimenting with various values in {3, 5, 7, 10, 15, 20} using the development dataset. For the gated relation representations, the top-10 ranked pairs are used to reduce the computational cost and remove unnecessary relations. In the relation gating process, we use two hidden layers for the multi-layer perceptrons. The task-specific output classifier contains two MLP layers. The model is optimized using the Adam optimizer. The training batch size is 4. During training, we freeze the parameters of RoBERTa in the first two epochs, and we stop training once no performance improvement is observed on the development dataset, which happens after 8 epochs. (A configuration sketch of this setup follows below.) |
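
The setup row above fixes most of the reported hyper-parameters. Below is a minimal PyTorch sketch of that configuration, assuming a Hugging Face `roberta-base` encoder; module names such as `relation_gate`, `classifier`, and `gate_top_k`, the layer widths, and the 3-way WIQA label space are illustrative assumptions, not the authors' actual implementation.

```python
# Hedged sketch of the reported RGN training setup (not the authors' code).
import torch
import torch.nn as nn
from transformers import RobertaModel

MAX_Q_LEN, MAX_P_LEN = 128, 256   # max tokens for question / paragraph
TOP_K = 10                        # top-k entities and top-10 ranked pairs kept by the gates
BATCH_SIZE = 4
FREEZE_EPOCHS = 2                 # RoBERTa parameters frozen for the first two epochs

encoder = RobertaModel.from_pretrained("roberta-base")
hidden = encoder.config.hidden_size

# MLP with two hidden layers used in the relation gating step (widths are assumptions).
relation_gate = nn.Sequential(
    nn.Linear(2 * hidden, hidden), nn.ReLU(),
    nn.Linear(hidden, hidden), nn.ReLU(),
    nn.Linear(hidden, 1),
)

# Two-layer task-specific output classifier (3 WIQA labels assumed: more / less / no effect).
classifier = nn.Sequential(
    nn.Linear(hidden, hidden), nn.ReLU(),
    nn.Linear(hidden, 3),
)

optimizer = torch.optim.Adam(
    list(encoder.parameters())
    + list(relation_gate.parameters())
    + list(classifier.parameters())
)

def set_encoder_frozen(frozen: bool) -> None:
    """Freeze or unfreeze the RoBERTa encoder parameters."""
    for p in encoder.parameters():
        p.requires_grad = not frozen

def gate_top_k(scores: torch.Tensor, values: torch.Tensor, k: int = TOP_K) -> torch.Tensor:
    """Keep only the k highest-scoring entity (or pair) representations.

    scores: (n,) relevance scores; values: (n, hidden) representations.
    """
    idx = scores.topk(min(k, scores.numel())).indices
    return values[idx]

for epoch in range(10):  # training reportedly stops after about 8 epochs
    set_encoder_frozen(epoch < FREEZE_EPOCHS)
    # ... iterate over WIQA batches of size BATCH_SIZE, compute gated
    # entity/relation representations, classify, and call optimizer.step() ...
```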