FusionNet: Fusing via Fully-aware Attention with Application to Machine Comprehension

Authors: Hsin-Yuan Huang, Chenguang Zhu, Yelong Shen, Weizhu Chen

ICLR 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We apply FusionNet to the Stanford Question Answering Dataset (SQuAD) and it achieves the first position for both single and ensemble model on the official SQuAD leaderboard at the time of writing (Oct. 4th, 2017). Meanwhile, we verify the generalization of FusionNet with two adversarial SQuAD datasets and it sets up the new state-of-the-art on both datasets: on AddSent, FusionNet increases the best F1 metric from 46.6% to 51.4%; on AddOneSent, FusionNet boosts the best F1 metric from 56.0% to 60.7%.
Researcher Affiliation | Collaboration | Hsin-Yuan Huang (1,2), Chenguang Zhu (1), Yelong Shen (1), Weizhu Chen (1); (1) Microsoft Business AI and Research, (2) National Taiwan University; momohuang@gmail.com, {chezhu,yeshen,wzchen}@microsoft.com
Pseudocode | No | The paper describes the architecture and processes using diagrams and mathematical equations but does not include structured pseudocode or algorithm blocks.
Open Source Code | Yes | An open-source implementation of FusionNet can be found at https://github.com/momohuang/FusionNet-NLI.
Open Datasets | Yes | We focus on the SQuAD dataset (Rajpurkar et al., 2016) to train and evaluate our model. SQuAD is a popular machine comprehension dataset consisting of 100,000+ questions created by crowd workers on 536 Wikipedia articles. (See the loading sketch after this table.)
Dataset Splits | Yes | We focus on the SQuAD dataset (Rajpurkar et al., 2016) to train and evaluate our model.
Hardware Specification | Yes | On a single NVIDIA GeForce GTX Titan X GPU, each epoch took roughly 20 minutes when batch size 32 is used.
Software Dependencies | No | The paper mentions software like "PyTorch" and "spaCy" but does not provide specific version numbers for these dependencies.
Experiment Setup | Yes | Detailed experimental settings can be found in Appendix E. In Appendix E, it states: "The batch size is set to 32, and the optimizer is Adamax (Kingma & Ba, 2014) with a learning rate α = 0.002, β = (0.9, 0.999) and ϵ = 10⁻⁸. A fixed random seed is used across all experiments. During training, we use a dropout rate of 0.4 (Srivastava et al., 2014) after the embedding layer (GloVe and CoVe) and before applying any linear transformation." (See the configuration sketch after this table.)
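
The SQuAD v1.1 data cited in the Open Datasets row is publicly distributed, so the standard train and dev splits can be obtained directly. The snippet below is a minimal Python sketch of doing so with the Hugging Face datasets library; that library choice and the field accesses shown are assumptions for illustration, not the authors' pipeline.

    # Minimal sketch (assumption): fetch SQuAD v1.1 via the Hugging Face `datasets` library.
    # The paper does not prescribe a loading pipeline; this only shows how to obtain the
    # standard train/dev splits used for training and evaluation.
    from datasets import load_dataset

    squad = load_dataset("squad")            # SQuAD v1.1
    train, dev = squad["train"], squad["validation"]

    example = train[0]
    print(example["question"])               # crowd-worker question
    print(example["context"][:200])          # Wikipedia passage containing the answer
    print(example["answers"]["text"][0])     # gold answer span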
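
For the Experiment Setup row, the quoted Appendix E hyperparameters (batch size 32, Adamax with α = 0.002, β = (0.9, 0.999), ϵ = 10⁻⁸, a fixed but unreported random seed, and dropout 0.4) map directly onto a PyTorch optimizer configuration. The sketch below only illustrates those reported values; the model and data are trivial placeholders, not the FusionNet architecture.

    # Hedged sketch of the reported training configuration (Appendix E).
    # Only the hyperparameter values come from the paper; the network, data,
    # and the specific seed value are placeholders for illustration.
    import torch
    from torch.utils.data import DataLoader, TensorDataset

    torch.manual_seed(0)  # "A fixed random seed is used across all experiments" (value unspecified)

    # Placeholder network; in the paper, dropout 0.4 is applied after the embedding
    # layer (GloVe and CoVe) and before any linear transformation.
    model = torch.nn.Sequential(torch.nn.Dropout(p=0.4), torch.nn.Linear(300, 2))

    optimizer = torch.optim.Adamax(
        model.parameters(),
        lr=0.002,            # α = 0.002
        betas=(0.9, 0.999),  # β = (0.9, 0.999)
        eps=1e-8,            # ϵ = 10⁻⁸
    )

    # Placeholder data; the real inputs are SQuAD passages and questions, batch size 32.
    dummy = TensorDataset(torch.randn(64, 300), torch.randint(0, 2, (64,)))
    loader = DataLoader(dummy, batch_size=32, shuffle=True)

    for x, y in loader:
        optimizer.zero_grad()
        loss = torch.nn.functional.cross_entropy(model(x), y)
        loss.backward()
        optimizer.step()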