Efficiently Answering Technical Questions - A Knowledge Graph Approach

Authors: Shuo Yang, Lei Zou, Zhongyuan Wang, Jun Yan, Ji-Rong Wen

AAAI 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct comprehensive experiments on a real-world dataset to evaluate the effectiveness and efficiency of our approach to answering technical questions. Our system outperforms a mainstream search engine and state-of-the-art information retrieval methods. Meanwhile, extensive experiments confirm the efficiency of our index-based online search mechanism.
Researcher Affiliation | Collaboration | Shuo Yang, Lei Zou, Zhongyuan Wang, Jun Yan, Ji-Rong Wen. Affiliations: Peking University; Beijing Institute of Big Data Research; Microsoft Research Asia; Renmin University of China & Beijing Key Laboratory of Big Data Management and Analysis Methods.
Pseudocode | No | The paper describes algorithmic steps in prose (e.g., for materialized node selection and random walk) but does not include formal pseudocode blocks or algorithms labeled as such.
Open Source Code | No | The paper provides a link to a demo system: 'The demo of our system can be visited via http://59.108.48.29/Search.' However, it does not explicitly provide a link to the open source code for the methodology itself.
Open Datasets | Yes | We collected 1,176,328 user questions together with 111,062 different help documents crawled from the technical forum (http://answers.microsoft.com/en-us/windows/forum).
Dataset Splits | No | The paper describes evaluation sets (EVAL1 and EVAL2) and how they were collected ('randomly select 60,000 questions in EVAL1', 'randomly select 100 user questions...for EVAL2'), but it does not specify explicit training, validation, or test splits with percentages or counts for reproducing the experiments.
Hardware Specification | No | The paper does not specify any hardware details such as CPU, GPU models, or memory used for running the experiments.
Software Dependencies | No | The paper mentions SVM-Rank for the LETOR implementation ('SVM-Rank, the radial basis function is used as the kernel function and the parameter c is set to 5.0.'), but does not provide a specific version number for SVM-Rank or other key software components used in the experiments.
Experiment Setup | Yes | For LETOR, we extract features of a question and two documents; that is, we use pairwise LETOR. Some pre-defined features are provided by the official website, and some other similarity features like (Clinchant and Gaussier 2010; Amati and Van Rijsbergen 2002; Fox and Shaw 1994) (features collected from both documents and similar questions) are also added. We choose negative samples randomly from the question set; the ratio of the number of positive to negative samples is varied among 1:3, 1:5, and 1:10. LETOR is implemented with SVM-Rank; the radial basis function is used as the kernel function and the parameter c is set to 5.0.
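The negative-sampling step quoted above can be sketched as follows. This is an illustrative reconstruction, not code from the paper: the function name `build_pairwise_samples` and its arguments are assumed, and it simply pairs each relevant document with `neg_ratio` randomly drawn non-relevant documents to form the preference pairs a pairwise LETOR learner (such as SVM-Rank) trains on.

```python
import random

def build_pairwise_samples(question, positive_docs, doc_pool, neg_ratio=3, seed=42):
    """Form (question, positive, negative) preference pairs for pairwise LETOR.

    For each relevant document, draw `neg_ratio` random negatives from the
    pool (mirroring the paper's 1:3, 1:5, and 1:10 settings).
    """
    rng = random.Random(seed)
    # Candidate negatives are pool documents not marked as relevant.
    candidates = [d for d in doc_pool if d not in positive_docs]
    pairs = []
    for pos in positive_docs:
        for neg in rng.sample(candidates, min(neg_ratio, len(candidates))):
            pairs.append((question, pos, neg))
    return pairs
```

Each resulting pair would then be converted into SVM-Rank's feature-vector input format, with the positive document ranked above the negative within the same query group.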