Learning to Recommend Quotes for Writing

Authors: Jiwei Tan, Xiaojun Wan, Jianguo Xiao

AAAI 2015

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiment results show that, our proposed approach is appropriate for this task and it outperforms other recommendation methods. |
| Researcher Affiliation | Academia | Jiwei Tan, Xiaojun Wan, and Jianguo Xiao; Institute of Computer Science and Technology, Peking University, Beijing 100871, China; The MOE Key Laboratory of Computational Linguistics, Peking University, Beijing 100871, China; {tanjiwei, wanxiaojun, xiaojianguo}@pku.edu.cn |
| Pseudocode | No | No pseudocode or clearly labeled algorithm blocks were found in the paper. |
| Open Source Code | No | No explicit statement about releasing the authors' own source code or a direct link to it was found. The paper mentions and links to 'RankLib' (http://people.cs.umass.edu/~vdang/ranklib.html), but this is a third-party tool used by the authors, not their implementation code. |
| Open Datasets | Yes | We collected quotes from the website of Library of Quotes [1]. ... [1] http://www.libraryofquotes.com ... In order to get real contexts of the quotes, we collected about 20GB raw texts of e-books from Project Gutenberg [2] (Hart 1971) as corpus. ... [2] http://www.gutenberg.org |
| Dataset Splits | Yes | The 64,323 context-quote pairs were randomly split, according to the proportion of 9:1:1, as training set, validation set and test set, respectively. |
| Hardware Specification | No | No specific hardware details (such as CPU/GPU models, memory, or cloud instance types) used for running the experiments were mentioned in the paper. |
| Software Dependencies | No | The paper mentions software like "Porter stemmer" and a "learning to rank tool called RankLib", but it does not provide specific version numbers for these components, which are required for reproducibility. |
| Experiment Setup | Yes | The parameters of RankLib we use are -norm zscore and -metric2t NDCG@5 ... In our experiments we select the 1000 quotes with largest similarities to the query context as candidate quotes. ... we also randomly sample 4 negative examples for the training data ... The dimension of latent semantic vectors is set to 1000. ... The number of topics is set to 1000. ... The dimension of explicit semantic vectors is 202037. ... The dimension of word vectors is set to 500. |
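
The random 9:1:1 split and the sampling of 4 negative examples described above could be sketched roughly as follows. This is an illustrative sketch, not the authors' code: the function names `split_pairs` and `sample_negatives`, the fixed seed, the integer rounding of split sizes, and the choice to draw negatives from the candidate quotes other than the gold quote are all assumptions, since the excerpts do not specify these details.

```python
import random


def split_pairs(pairs, ratios=(9, 1, 1), seed=0):
    """Shuffle pairs and split them into train/validation/test by `ratios`.

    Hypothetical helper: the rounding rule (floor for train and validation,
    remainder goes to test) is an assumption, not taken from the paper.
    """
    rng = random.Random(seed)
    shuffled = list(pairs)
    rng.shuffle(shuffled)
    total = sum(ratios)
    n_train = len(shuffled) * ratios[0] // total
    n_val = len(shuffled) * ratios[1] // total
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


def sample_negatives(gold_quote, candidates, k=4, seed=0):
    """Randomly pick up to k candidate quotes that are not the gold quote.

    Hypothetical helper: the pool the negatives are drawn from is an
    assumption; the excerpt only states that 4 negatives are sampled.
    """
    rng = random.Random(seed)
    pool = [q for q in candidates if q != gold_quote]
    return rng.sample(pool, min(k, len(pool)))
```

With 64,323 context-quote pairs, this rounding rule gives 52,627 training, 5,847 validation, and 5,849 test pairs; the paper does not report the exact set sizes in the excerpts quoted here.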