Medical Exam Question Answering with Large-scale Reading Comprehension

Authors: Xiao Zhang, Ji Wu, Zhiyang He, Xien Liu, Ying Su

AAAI 2018

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "In experiments our SeaReader achieved a large increase in accuracy on MedQA over competing models." and "Experiments — Experimental Setup: Word embedding is trained on a corpus combining all text from the problems in the training set and the document collection using skip-gram (Mikolov et al. 2013). The dimension is set to 200." and "Table 3: Results (accuracy) of SeaReader and other approaches on MedQA task" |
| Researcher Affiliation | Collaboration | Xiao Zhang (1), Ji Wu (1), Zhiyang He (1), Xien Liu (2), Ying Su (2); (1) Department of Electronic Engineering, Tsinghua University; (2) Tsinghua-iFlytek Joint Laboratory, Tsinghua University |
| Pseudocode | No | The paper contains architectural diagrams and mathematical formulas but no blocks explicitly labeled "Pseudocode" or "Algorithm". |
| Open Source Code | No | The paper does not include any statement or link regarding the public availability of its source code. |
| Open Datasets | No | "We propose MedQA, our reading comprehension task on clinical medicine aiming at simulating a real-world scenario." and "We assembled a large collection of text materials in MedQA as a source of information, to learn to read large-scale text." and "We collected over 270,000 test problems from the internet and published materials such as exercise books." and "We prepared text materials from a total of 32 publications including textbooks, reference books, guidebooks..." The paper describes how the dataset was created and its characteristics, but does not provide access information for the final assembled dataset. |
| Dataset Splits | Yes | "For training/test split, we created a test set as similar as possible to past exam problems... These problems are further split into valid/test sets." and Table 2 (data statistics): training 222,323 / valid 6,446 / test 6,405 |
| Hardware Specification | No | "We used a batch size of 15, which already contains 750 documents per batch and is the maximum allowed to train on a single GPU." No specific hardware model is mentioned. |
| Software Dependencies | No | "Our model is implemented using TensorFlow (Abadi et al. 2016)." No version number is given for TensorFlow or other libraries. |
| Experiment Setup | Yes | "Word embedding... The dimension is set to 200." and "All documents are truncated to no more than 100 words before processed by SeaReader." and "In most experiments, we retain 10 documents for each candidate answer, a total of 50 documents per problem." and "Bidirectional LSTMs in the context layer and the reasoning layer all have a dimension of 128." and "Adam optimizer is used with ε = 10⁻⁶ to stabilize training. Exponential decay of learning rate and dropout rate of 0.2 was used to reduce overfit. We used a batch size of 15..." |
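The experiment-setup row above lists enough hyperparameters to reconstruct a training configuration. A minimal Python sketch, collecting only the values the paper states (embedding dim 200, BiLSTM dim 128, 10 documents per candidate answer, 100-word truncation, dropout 0.2, Adam ε = 10⁻⁶, batch size 15); the exponential-decay constants `decay_steps` and `decay_rate`, and the base learning rate in the usage line, are assumptions for illustration, since the paper reports only that exponential decay was used:

```python
from dataclasses import dataclass


@dataclass
class SeaReaderConfig:
    """Hyperparameters reported in the paper's experiment setup."""
    embedding_dim: int = 200    # skip-gram word embeddings
    lstm_dim: int = 128         # context- and reasoning-layer BiLSTMs
    candidates: int = 5         # multiple-choice answers per problem
    docs_per_answer: int = 10   # 5 candidates -> 50 documents per problem
    max_doc_len: int = 100      # documents truncated to 100 words
    dropout: float = 0.2
    adam_epsilon: float = 1e-6
    batch_size: int = 15        # 15 problems x 50 docs = 750 docs per batch

    def docs_per_batch(self) -> int:
        return self.batch_size * self.candidates * self.docs_per_answer


def exp_decay_lr(base_lr: float, step: int,
                 decay_steps: int = 1000, decay_rate: float = 0.96) -> float:
    """Exponentially decayed learning rate (schedule constants are assumed)."""
    return base_lr * decay_rate ** (step / decay_steps)


cfg = SeaReaderConfig()
print(cfg.docs_per_batch())          # 750, matching the paper's statement
print(exp_decay_lr(0.001, 0))        # base rate before any decay
```

This also makes the hardware note above concrete: at 750 documents per batch, batch size 15 was the stated single-GPU maximum.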