Learning to Recommend Quotes for Writing
Authors: Jiwei Tan, Xiaojun Wan, Jianguo Xiao
AAAI 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiment results show that, our proposed approach is appropriate for this task and it outperforms other recommendation methods. |
| Researcher Affiliation | Academia | Jiwei Tan and Xiaojun Wan and Jianguo Xiao; Institute of Computer Science and Technology, Peking University, Beijing 100871, China; The MOE Key Laboratory of Computational Linguistics, Peking University, Beijing 100871, China; {tanjiwei, wanxiaojun, xiaojianguo}@pku.edu.cn |
| Pseudocode | No | No pseudocode or clearly labeled algorithm blocks were found in the paper. |
| Open Source Code | No | The paper contains no explicit statement about releasing the authors' own source code and no direct link to it. It mentions and links to RankLib (http://people.cs.umass.edu/~vdang/ranklib.html), but this is a third-party tool used by the authors, not their implementation code. |
| Open Datasets | Yes | We collected quotes from the website of Library of Quotes (footnote 1: http://www.libraryofquotes.com). ... In order to get real contexts of the quotes, we collected about 20GB raw texts of e-books from Project Gutenberg (footnote 2: http://www.gutenberg.org) (Hart 1971) as corpus. |
| Dataset Splits | Yes | The 64,323 context-quote pairs were randomly split, according to the proportion of 9:1:1, as training set, validation set and test set, respectively. |
| Hardware Specification | No | No specific hardware details (such as CPU/GPU models, memory, or cloud instance types) used for running the experiments were mentioned in the paper. |
| Software Dependencies | No | The paper mentions software such as the "Porter stemmer" and a learning-to-rank tool called RankLib, but it gives no version numbers for these components, which reproducibility requires. |
| Experiment Setup | Yes | The parameters of RankLib we use are -norm zscore and -metric2t NDCG@5 ... In our experiments we select the 1000 quotes with largest similarities to the query context as candidate quotes. ... we also randomly sample 4 negative examples for the training data ... The dimension of latent semantic vectors is set to 1000. ... The number of topics is set to 1000. ... The dimension of explicit semantic vectors is 202037. ... The dimension of word vectors is set to 500. |
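
The split and negative-sampling steps quoted in the table can be illustrated with a minimal sketch. This is not the authors' code, which was not released. The function names `split_pairs` and `sample_negatives`, the fixed seeds, the placeholder data, and the rounding of the 9:1:1 proportion are all assumptions made for illustration; the paper does not say how remainders were assigned.

```python
import random


def split_pairs(pairs, ratios=(9, 1, 1), seed=0):
    """Shuffle pairs and split them into train/validation/test by integer ratios.

    Assumption: any remainder left by integer division goes to the test set.
    """
    rng = random.Random(seed)
    shuffled = list(pairs)
    rng.shuffle(shuffled)
    total = sum(ratios)
    n = len(shuffled)
    n_train = n * ratios[0] // total
    n_valid = n * ratios[1] // total
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]
    return train, valid, test


def sample_negatives(gold_quote, candidate_quotes, k=4, seed=0):
    """Draw up to k candidates that differ from the gold quote (hypothetical helper)."""
    rng = random.Random(seed)
    pool = [q for q in candidate_quotes if q != gold_quote]
    return rng.sample(pool, min(k, len(pool)))


# Placeholder data with the same size as the reported 64,323 context-quote pairs.
pairs = [(f"context-{i}", f"quote-{i}") for i in range(64323)]
train, valid, test = split_pairs(pairs)

# Placeholder candidate list standing in for the 1000 most similar quotes.
candidates = [f"quote-{i}" for i in range(1000)]
negatives = sample_negatives("quote-7", candidates, k=4)
```

Seeding the generator makes the split repeatable, which is one of the details a reader would need from the paper to reproduce the reported numbers exactly.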