Exploiting Sentence Embedding for Medical Question Answering

Authors: Yu Hao, Xien Liu, Ji Wu, Ping Lv

AAAI 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The comparison results show that our proposed framework achieved significant improvements compared to competitive baseline approaches. Additionally, a series of controlled experiments are also conducted to illustrate that the multi-scale strategy and the contextual self-attention layer play important roles for producing effective sentence embedding, and the two kinds of scoring strategies are highly complementary to each other for question answering problems.
Researcher Affiliation | Collaboration | (1) Department of Electronic Engineering, Tsinghua University, Beijing, China: haoy15@mails.tsinghua.edu.cn, {xeliu, wuji_ee}@mail.tsinghua.edu.cn; (2) Tsinghua-iFlytek Joint Laboratory, iFlytek Research, Beijing, China: luping_ts@mail.tsinghua.edu.cn
Pseudocode | No | No pseudocode or algorithm blocks are provided; figures illustrate frameworks, not structured algorithms.
Open Source Code | No | The paper does not provide an unambiguous statement or link to open-source code for the described methodology.
Open Datasets | No | The datasets used (Medical QA#1 NMLEC and Medical QA#2 CD-EMR) are stated to be collected by the authors from specific sources (NMLEC exam, EMRs from hospitals) and there is no indication or link provided for their public availability.
Dataset Splits | No | The paper only explicitly mentions training and test sets for both Medical QA#1 and Medical QA#2 datasets (e.g., 'totally 250,000 medical questions as the training set' and 'the test set has 6,000 questions' for QA#1; 'The training set has 75265 items, and the test set has 16551 items' for QA#2), but no specific validation split or set is described.
Hardware Specification | No | The paper mentions training on 'the GPU' but does not specify any particular GPU model, CPU, or other hardware components (e.g., 'NVIDIA A100', 'Intel Xeon').
Software Dependencies | No | The paper mentions 'Tensorflow (Abadi et al. 2016)' but does not provide a specific version number for TensorFlow or for any other key software library.
Experiment Setup | Yes | The embedding dimension is set to 200 for Medical QA#1 and 100 for Medical QA#2. [...] truncate all evidence documents and questions to no more than 100 words for Medical QA#1 and 70 words for Medical QA#2. For each candidate choice, only the top 10 evidence documents are used to calculate the supportive score. The Bi-directional LSTM in the context layer has a dimension of 128. The size of the attention encoding hidden state da (see Fig. 3(b)) is 100. The number of semantics, r, is 15. Unless otherwise specified, the convolution sizes in the multi-scale context layer of the CAMSE framework are 1, 2, and 3. [...] We use the Adam optimizer with exponential decay of the learning rate and a dropout rate of 0.2 to reduce overfitting, and the batch size is 10.
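Since the row above is the only place the paper's hyperparameters are enumerated, a small sketch collecting them may help anyone attempting a reimplementation. This is not the authors' released code (none is available, per the Open Source Code row); it only gathers the stated values into Python dictionaries and shows one plausible way to build an Adam optimizer with exponential learning-rate decay using the TensorFlow 2 Keras API. The initial learning rate, decay steps, and decay rate are not reported in the paper, so the values below are assumed placeholders.

```python
import tensorflow as tf

# Hyperparameters stated in the paper, grouped per dataset.
# Anything marked "assumed" is NOT given in the paper.
CONFIG = {
    "MedicalQA#1_NMLEC": {"embedding_dim": 200, "max_len_words": 100},
    "MedicalQA#2_CD-EMR": {"embedding_dim": 100, "max_len_words": 70},
    "shared": {
        "top_k_evidence": 10,            # evidence documents per candidate choice
        "bilstm_dim": 128,               # Bi-directional LSTM in the context layer
        "attention_hidden_da": 100,      # attention encoding hidden state size
        "num_semantics_r": 15,           # number of semantics in self-attention
        "conv_kernel_sizes": [1, 2, 3],  # multi-scale context layer
        "dropout_rate": 0.2,
        "batch_size": 10,
    },
}

# Adam with an exponentially decayed learning rate (TF 2.x Keras API).
# The paper states only that Adam with exponential decay was used;
# the three numbers below are assumed placeholders.
lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=1e-3,  # assumed
    decay_steps=1000,            # assumed
    decay_rate=0.96,             # assumed
)
optimizer = tf.keras.optimizers.Adam(learning_rate=lr_schedule)
```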