Extracting Keyphrases from Research Papers Using Citation Networks

Authors: Sujatha Das Gollapalli, Cornelia Caragea

AAAI 2014

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We experimentally validate CiteTextRank on several representative datasets and show statistically significant improvements over existing state-of-the-art models for keyphrase extraction.
Researcher Affiliation | Academia | Sujatha Das Gollapalli and Cornelia Caragea, Computer Science and Engineering, University of North Texas. Email: gsdas@cse.psu.edu, ccaragea@unt.edu
Pseudocode | No | The paper describes the CiteTextRank algorithm using definitions, textual descriptions of steps, and mathematical formulas (e.g., Equations 1 and 2), but it does not include a formal 'Pseudocode' or 'Algorithm' block.
Open Source Code | No | The paper states: 'All datasets are available upon request.' However, there is no explicit statement or link indicating that the source code for the proposed methodology (CiteTextRank) is open-source or publicly available.
Open Datasets | Yes | We constructed three such datasets. The first two are proceedings of the last ten years of: (1) the ACM Conference on Knowledge Discovery and Data Mining (KDD), and (2) the World Wide Web Conference (WWW). The third dataset (referred to as UMD in this paper) was made available by Lise Getoor's research group at the University of Maryland (footnote 2: http://www.cs.umd.edu/~sen/lbc-proj/LBC.html).
Dataset Splits | No | The paper mentions parameter tuning and selecting best-performing settings for experiments, but it does not specify the exact percentages or absolute sample counts for training, validation, and test splits, nor does it detail a cross-validation setup or how the data was partitioned for these purposes.
Hardware Specification | No | The paper does not provide any specific details about the hardware used for running the experiments, such as CPU models, GPU models, memory, or cloud computing instance types.
Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., programming languages, libraries, frameworks, or solvers) that were used to implement or run the experiments.
Experiment Setup | Yes | CTR has two sets of parameters, the window size w that determines how the edges are added between candidate word nodes in the graph and the λ_t values that determine the weight of each context type... Values 1-10 were tested for each parameter in steps of 1... where α is the damping factor typically set to 0.85 (Haveliwala et al. 2003).
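
To make the roles of w, the context-type weights, and α in the Experiment Setup row concrete, here is a minimal sketch of a TextRank-style ranker over a word graph. It is an illustration under stated assumptions, not the authors' implementation, and it does not reproduce the paper's Equations 1 and 2. The function names, the context-type labels ("global", "cited"), and the simplification that each co-occurrence adds the context-type weight to an edge are all hypothetical. Any additional weighting the paper applies is omitted.

```python
from collections import defaultdict


def build_word_graph(contexts, window, type_weights):
    """Build an undirected weighted word graph.

    contexts: list of (context_type, tokens) pairs (hypothetical input shape).
    window: two words are linked if they fall within `window` consecutive tokens.
    type_weights: weight per context type (stands in for the lambda_t values).
    """
    graph = defaultdict(lambda: defaultdict(float))
    for context_type, tokens in contexts:
        weight = type_weights[context_type]
        for i, left in enumerate(tokens):
            # Words in positions i+1 .. i+window-1 share a window with `left`.
            for right in tokens[i + 1 : i + window]:
                if left == right:
                    continue
                graph[left][right] += weight
                graph[right][left] += weight
    # Return plain dicts so later lookups cannot create nodes by accident.
    return {word: dict(neighbours) for word, neighbours in graph.items()}


def rank_words(graph, alpha=0.85, iterations=50):
    """Weighted PageRank-style scoring with damping factor alpha."""
    scores = {word: 1.0 for word in graph}
    # Total edge weight leaving each node, used to normalise contributions.
    out_weight = {word: sum(nbrs.values()) for word, nbrs in graph.items()}
    for _ in range(iterations):
        updated = {}
        for word, neighbours in graph.items():
            incoming = sum(
                edge / out_weight[nbr] * scores[nbr]
                for nbr, edge in neighbours.items()
            )
            updated[word] = (1.0 - alpha) + alpha * incoming
        scores = updated
    return scores


if __name__ == "__main__":
    # Toy data; the tokens and weights are invented for illustration only.
    contexts = [
        ("global", ["keyphrase", "extraction", "citation", "network"]),
        ("cited", ["citation", "network", "keyphrase"]),
    ]
    graph = build_word_graph(contexts, window=2, type_weights={"global": 1, "cited": 2})
    for word, score in sorted(rank_words(graph).items(), key=lambda kv: -kv[1]):
        print(f"{word}\t{score:.3f}")
```

A tuning loop in the spirit of the row above would call these functions for each window size and each context weight in the range 1 to 10 and keep the best-scoring setting. The evaluation metric and data partitioning are not specified in this summary, so they are left out of the sketch.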