Deep Context: A Neural Language Model for Large-scale Networked Documents
Authors: Hao Wu, Kristina Lerman
IJCAI 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our model on large-scale data collections that include Wikipedia pages, and scientific and legal citation networks. We demonstrate its effectiveness and efficiency on document classification and link prediction tasks. |
| Researcher Affiliation | Academia | Hao Wu (USC ISI, hwu732@usc.edu); Kristina Lerman (USC ISI, lerman@isi.edu) |
| Pseudocode | No | The paper describes algorithms textually and through mathematical equations, but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any links or explicit statements about the availability of its source code. |
| Open Datasets | Yes | Wikipedia: A dump of Wikipedia pages in October 2015 is used in our experiments (http://dumps.wikimedia.org/enwiki/latest/enwiki-latest-pages-articles.xml.bz2). DBLP: We download the DBLP data set [Tang et al., 2008], which contains a collection of papers with titles and citation links (http://arnetminer.org/lab-datasets/citation/DBLP_citation_2014_May.zip). Legal: We collect a large digitized record of federal court opinions from the Court Listener project (https://www.courtlistener.com/) in our study. A download sketch follows the table. |
| Dataset Splits | Yes | The results are averaged over 5-fold cross-validation on the sampled data. A fold-splitting sketch follows the table. |
| Hardware Specification | Yes | We perform experiments on a single machine with 64 CPU cores at 2.3 GHz and 256 GB of memory. |
| Software Dependencies | No | The paper mentions "Asynchronous stochastic gradient descent algorithm is used with 40 threads to optimize our models" but does not name any software packages or version numbers needed for reproducibility. |
| Experiment Setup | Yes | The dimensionality of word and document vectors is fixed at 400 for all learning models. The number of negative samples is fixed at 5 for Skip-gram, PV, LINE, and DCV. We set the word context window size n = 5 in DCV-vLBL and b = 5 in DCV-ivLBL. The document-sequence context window m is fixed at 1, so only the immediate neighbors that the current document links to are considered. A configuration sketch follows the table. |
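
The Wikipedia and DBLP corpora listed under Open Datasets are plain HTTP downloads. Below is a minimal fetch sketch; the URLs are quoted from the paper and may no longer resolve, and the local file names are simply derived from the URLs. The Legal corpus is omitted because Court Listener distributes its data through bulk exports rather than a single file.

```python
# Minimal sketch for fetching the two directly downloadable corpora.
# URLs are quoted from the paper; they may have moved since 2017.
import urllib.request

DATASETS = {
    "wikipedia": "http://dumps.wikimedia.org/enwiki/latest/enwiki-latest-pages-articles.xml.bz2",
    "dblp": "http://arnetminer.org/lab-datasets/citation/DBLP_citation_2014_May.zip",
}

for name, url in DATASETS.items():
    filename = url.rsplit("/", 1)[-1]  # derive a local file name from the URL
    print(f"Downloading {name} from {url} ...")
    urllib.request.urlretrieve(url, filename)  # saves into the working directory
```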
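
For the 5-fold protocol under Dataset Splits, a minimal sketch follows, assuming document vectors `X` and labels `y` are already materialized as NumPy arrays. The logistic-regression classifier is an assumption; the paper only states that results are averaged over the five folds.

```python
# Sketch of the 5-fold cross-validation protocol described in the table.
# Assumes X (document vectors) and y (labels) are NumPy arrays.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold

def five_fold_accuracy(X: np.ndarray, y: np.ndarray) -> float:
    scores = []
    for train_idx, test_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
        clf = LogisticRegression(max_iter=1000)  # assumed classifier choice
        clf.fit(X[train_idx], y[train_idx])
        scores.append(clf.score(X[test_idx], y[test_idx]))
    return float(np.mean(scores))  # average over the 5 folds, as in the paper
```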
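
The hyperparameters under Experiment Setup map directly onto gensim's Doc2Vec, which implements the PV baseline (not DCV itself, which has no public implementation). The sketch below only illustrates the shared settings: 400-dimensional vectors, 5 negative samples, a context window of 5, and 40 worker threads matching the 40 asynchronous SGD threads. The toy corpus and `min_count=1` are assumptions made so the snippet runs standalone.

```python
# Sketch of the PV baseline configured with the paper's shared hyperparameters.
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

# Toy corpus standing in for the Wikipedia/DBLP/Legal documents.
docs = [
    TaggedDocument(words=["neural", "language", "model"], tags=["d0"]),
    TaggedDocument(words=["citation", "network", "links"], tags=["d1"]),
]

model = Doc2Vec(
    documents=docs,
    vector_size=400,  # word/document vector dimensionality
    negative=5,       # negative samples per positive example
    window=5,         # word context window (n = 5)
    workers=40,       # matches the 40 asynchronous SGD threads
    min_count=1,      # keep all words so the toy corpus survives pruning
)
print(model.dv["d0"].shape)  # (400,)
```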