Generating Recommendation Evidence Using Translation Model

Authors: Jizhou Huang, Shiqi Zhao, Shiqiang Ding, Haiyang Wu, Mingming Sun, Haifeng Wang

IJCAI 2016

Reproducibility Variable: Result. LLM Response
Research Type: Experimental. "The experiments show that our method is domain independent, and can generate catchy and interesting evidences in the application of entity recommendation. The experimental results show that our approach is very promising."
Researcher Affiliation: Collaboration. Harbin Institute of Technology, Harbin, China; Baidu Inc., Beijing, China. {huangjizhou01, zhaoshiqi, dingshiqiang01, wuhaiyang, sunmingming01, wanghaifeng}@baidu.com
Pseudocode: No. No pseudocode or algorithm blocks are present in the paper.
Open Source Code: No. The paper contains no statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets: No. The paper describes collecting data from "Baidu Baike" and the "clickthrough data of Baidu web search engine" to construct the training corpus, but it provides no access information (link, DOI, etc.) for this dataset. "A total of 55,149,076 title-query aligned pairs were obtained, which we used as sentence-evidence parallel corpus to train our evidence generation model."
Dataset Splits: Yes. "To construct the test set for the classifier, we randomly sampled 10% of the data from positive and negative instances separately, and the 90% of data were left for training." The parameters of the three EG models are tuned on the development data as described in Section 3.4 and evaluated on the test data as described in Section 4.1.
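The per-class 10%/90% split described above can be sketched as follows. This is a minimal illustration, not the authors' code; the function name and list-based data representation are assumptions.

```python
import random

def split_train_test(positives, negatives, test_frac=0.1, seed=0):
    """Sample test_frac of each class separately for the test set;
    the remainder of each class goes to the training set.
    Mirrors the paper's per-class 10%/90% split (hypothetical helper)."""
    rng = random.Random(seed)
    train, test = [], []
    for instances in (positives, negatives):
        idx = list(range(len(instances)))
        rng.shuffle(idx)  # randomize within the class before slicing
        n_test = int(len(instances) * test_frac)
        test.extend(instances[i] for i in idx[:n_test])
        train.extend(instances[i] for i in idx[n_test:])
    return train, test
```

Sampling within each class keeps the positive/negative ratio of the test set close to that of the full data.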
Hardware Specification: No. The paper does not provide any specific details about the hardware (e.g., CPU or GPU models, memory) used to run the experiments.
Software Dependencies: No. The paper mentions software tools for Chinese word segmentation [Peng et al., 2004], POS tagging [Giménez and Màrquez, 2004], and dependency parsing [McDonald et al., 2006], as well as a Maximum Entropy classifier, but no version numbers are provided for these or any other software components.
Experiment Setup: Yes. "We use a tri-gram language model in this work. We use a length-penalty function to generate short evidences whenever possible. We combine the four sub-models based on a log-linear framework and get the EG model: p(e|s) = λtm... To estimate parameters λtm, λlm, λlf, λhl, and λss, we adopt the approach of minimum error rate training (MERT) that is popular in SMT [Och, 2003]. For the EG method proposed in this paper, we have trained three models. The first EG model combines M1, M2, and M3... The second EG model combines M1, M2, M3, and M4-1... The third considers all sub-models M1, M2, M3, and M4..."
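The log-linear combination quoted above, score(e|s) = Σ_k λ_k · log p_k(e|s), can be sketched as below. This is an illustrative skeleton only: the function names and the candidate data structure are assumptions, and MERT itself (tuning the λ weights to minimize error on the development data) is not shown.

```python
import math

def loglinear_score(feature_log_probs, weights):
    """Weighted sum of sub-model log-probabilities:
    score(e|s) = sum_k lambda_k * log p_k(e|s)."""
    return sum(w * lp for w, lp in zip(weights, feature_log_probs))

def best_candidate(candidates, weights):
    """Return the candidate evidence with the highest log-linear score.
    Each candidate carries its sub-model log-probabilities in 'feats'."""
    return max(candidates, key=lambda c: loglinear_score(c["feats"], weights))
```

With all weights equal, the combination reduces to ranking by the product of the sub-model probabilities; MERT would instead search for the weight vector that maximizes an evaluation metric on held-out data.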