DisenCite: Graph-Based Disentangled Representation Learning for Context-Specific Citation Generation

Authors: Yifan Wang, Yiping Song, Shuai Li, Chaoran Cheng, Wei Ju, Ming Zhang, Sheng Wang

AAAI 2022 (pp. 11449-11458) | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments demonstrate the superior performance of our method compared to state-of-the-art approaches. We further conduct ablation and case studies to reassure that the improvement of our method comes from generating the context-specific citation through incorporating the citation graph.
Researcher Affiliation | Academia | 1 School of Computer Science, Peking University, Beijing, China; 2 National University of Defense Technology; 3 Paul G. Allen School of Computer Science, University of Washington
Pseudocode | No | The paper describes its model components and logic using prose and mathematical equations but does not include a formal 'Pseudocode' or 'Algorithm' block.
Open Source Code | No | The paper states 'We release GCite, a graph enhanced contextual citation dataset...' (footnote: https://github.com/jamesyifan/DisenCite) but does not explicitly state that the source code for the methodology is available at this link.
Open Datasets | Yes | We construct a graph enhanced contextual citation dataset GCite, consisting of 25K relationships with different types... over 4.8K papers extracted from the computer science domain of S2ORC (Lo et al. 2020). We release GCite, a graph enhanced contextual citation dataset... (footnote: https://github.com/jamesyifan/DisenCite)
Dataset Splits | Yes | We randomly select 80% of citation relations to constitute the training set, and treat the remaining 10%, 10% as the validation and test set respectively. (A minimal split sketch is given after the table.)
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., CPU, GPU models, memory) used for running the experiments.
Software Dependencies | No | The paper mentions software like 'Pytorch', 'GRU', and 'Adam optimizer' but does not specify their version numbers.
Experiment Setup | Yes | The word embeddings are randomly initialized with dimension d = 50. We limit the input document length to 600 tokens with each section (introduction, method and experiment) less than 200 and citation context length less than 50. For our method, we sample 2 hops of neighborhoods for the target node pair as subgraph, with 5 and 4 type-specific neighbors per hop respectively. The hyper-parameters α = 1, β = 1e-1, γ = 1e-1, and dropout with probability p = 0.35 are employed for all parameters to prevent overfitting. We optimize DisenCite with the Adam optimizer by setting the initial learning rate lr = 5e-3 and use early stopping with a patience of 20, i.e. we stop training if ROUGE-L on the validation set does not increase for 20 successive epochs. (A hedged training-loop sketch based on these settings follows the table.)
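
The 80%/10%/10% partition reported in the Dataset Splits row can be reproduced with a simple random split over the citation relations. The sketch below is a minimal illustration, not the authors' released code; the record format and the random seed are assumptions.

```python
import random

def split_citation_relations(relations, seed=42):
    """Randomly split citation relations into 80% train, 10% validation, 10% test.

    `relations` is assumed to be a list of citation-relation records
    (e.g. (citing_paper, cited_paper, section, citation_text) tuples).
    The seed is illustrative; the paper does not report one.
    """
    relations = list(relations)
    random.Random(seed).shuffle(relations)

    n = len(relations)
    n_train = int(0.8 * n)
    n_val = int(0.1 * n)

    train = relations[:n_train]
    val = relations[n_train:n_train + n_val]
    test = relations[n_train + n_val:]
    return train, val, test
```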
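
The Experiment Setup row quotes the optimizer, learning rate, loss weights, dropout, and early-stopping settings. The following PyTorch sketch shows how those reported values could be wired into a training loop; `model`, the data loaders, `model.loss`, and `evaluate_rouge_l` are placeholder names assumed for illustration, and `max_epochs` is not reported in the paper.

```python
import torch

# Hyper-parameters quoted in the Experiment Setup row.
EMBED_DIM = 50                       # word embedding dimension d
ALPHA, BETA, GAMMA = 1.0, 1e-1, 1e-1  # reported loss weights
DROPOUT_P = 0.35                     # applied inside the model, e.g. nn.Dropout(p=0.35)
LEARNING_RATE = 5e-3
PATIENCE = 20                        # early stopping on validation ROUGE-L

def train(model, train_loader, val_loader, evaluate_rouge_l, max_epochs=200):
    """Sketch of the reported training procedure: Adam with lr = 5e-3 and
    early stopping once validation ROUGE-L has not improved for 20 epochs.
    `model`, the loaders, and `evaluate_rouge_l` are placeholders, not the
    authors' released code; `max_epochs` is an assumed cap."""
    optimizer = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE)

    best_rouge_l, epochs_without_improvement = float("-inf"), 0
    for epoch in range(max_epochs):
        model.train()
        for batch in train_loader:
            optimizer.zero_grad()
            # Assumed combined objective weighting the generation loss and
            # auxiliary terms with the reported alpha, beta, gamma.
            loss = model.loss(batch, alpha=ALPHA, beta=BETA, gamma=GAMMA)
            loss.backward()
            optimizer.step()

        model.eval()
        with torch.no_grad():
            rouge_l = evaluate_rouge_l(model, val_loader)

        if rouge_l > best_rouge_l:
            best_rouge_l, epochs_without_improvement = rouge_l, 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= PATIENCE:
                break  # validation ROUGE-L stalled for 20 successive epochs
```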