DisenCite: Graph-Based Disentangled Representation Learning for Context-Specific Citation Generation
Authors: Yifan Wang, Yiping Song, Shuai Li, Chaoran Cheng, Wei Ju, Ming Zhang, Sheng Wang
AAAI 2022, pp. 11449-11458 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate the superior performance of our method compared to state-of-the-art approaches. We further conduct ablation and case studies to confirm that the improvement of our method comes from generating the context-specific citation through incorporating the citation graph. |
| Researcher Affiliation | Academia | School of Computer Science, Peking University, Beijing, China; National University of Defense Technology; Paul G. Allen School of Computer Science, University of Washington |
| Pseudocode | No | The paper describes its model components and logic using prose and mathematical equations but does not include a formal 'Pseudocode' or 'Algorithm' block. |
| Open Source Code | No | The paper states 'We release GCite, a graph enhanced contextual citation dataset...' (footnote: https://github.com/jamesyifan/DisenCite) but does not explicitly state that the source code for the methodology is available at this link. |
| Open Datasets | Yes | We construct a graph enhanced contextual citation dataset GCite, consisting of 25K relationships with different types... over 4.8K papers extracted from the computer science domain of S2ORC (Lo et al. 2020). We release GCite, a graph enhanced contextual citation dataset... (https://github.com/jamesyifan/DisenCite) |
| Dataset Splits | Yes | We randomly select 80% of citation relations to constitute the training set, and treat the remaining 10% and 10% as the validation and test sets, respectively. (A minimal split sketch appears after the table.) |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., CPU, GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions software like 'PyTorch', 'GRU', and 'Adam optimizer' but does not specify their version numbers. |
| Experiment Setup | Yes | The word embeddings are randomly initialized with dimension d = 50. We limit the input document length to 600 tokens, with each section (introduction, method and experiment) under 200 tokens and the citation context under 50 tokens. For our method, we sample 2 hops of neighborhoods for the target node pair as a subgraph, with 5 and 4 type-specific neighbors sampled at each hop, respectively. The hyper-parameters are α = 1, β = 1e-1, γ = 1e-1, and dropout with probability p = 0.35 is employed for all parameters to prevent overfitting. We optimize DisenCite with the Adam optimizer, setting the initial learning rate lr = 5e-3, and use early stopping with a patience of 20, i.e. we stop training if ROUGE-L on the validation set does not increase for 20 successive epochs. (A hedged training-setup sketch appears after the table.) |
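
The quoted 80/10/10 split can be reproduced in a few lines of Python. This is a minimal sketch, assuming `relations` is a list of citation-relation records; the function name, record format, and seed value are illustrative, not from the paper:

```python
import random

def split_relations(relations, seed=0):
    """Randomly split citation relations into 80% train / 10% val / 10% test."""
    rng = random.Random(seed)            # fixed seed so the split is repeatable
    shuffled = list(relations)
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train, n_val = int(0.8 * n), int(0.1 * n)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]    # remaining ~10%
    return train, val, test
```

Note that the paper does not report a random seed, so the exact membership of each split cannot be recovered from the text alone.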
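
The quoted experiment setup maps onto a standard PyTorch training skeleton. The sketch below is hedged: only the hyper-parameter values come from the paper; the embedding stand-in, `train_one_epoch`, `evaluate_rouge_l`, and `MAX_EPOCHS` are hypothetical placeholders for components the paper does not describe or release.

```python
import torch
import torch.nn as nn

# Hyper-parameters quoted from the paper's experiment setup.
EMB_DIM = 50                  # word embedding dimension d
MAX_DOC_LEN = 600             # total input document length (tokens)
MAX_SECTION_LEN = 200         # per-section limit (introduction/method/experiment)
MAX_CONTEXT_LEN = 50          # citation context length
NUM_HOPS = 2                  # subgraph sampling depth around the target node pair
NEIGHBORS_PER_HOP = (5, 4)    # type-specific neighbors sampled at each hop
ALPHA, BETA, GAMMA = 1.0, 1e-1, 1e-1  # loss-term weights
DROPOUT = 0.35                # dropout probability p
LR = 5e-3                     # initial Adam learning rate
PATIENCE = 20                 # early stopping on validation ROUGE-L
MAX_EPOCHS = 200              # not stated in the paper; placeholder bound

model = nn.Embedding(10_000, EMB_DIM)  # stand-in; the real DisenCite model is not released

def train_one_epoch(model, optimizer):
    """Placeholder: one pass over the training set."""

def evaluate_rouge_l(model):
    """Placeholder: ROUGE-L computed on the validation set."""
    return 0.0

optimizer = torch.optim.Adam(model.parameters(), lr=LR)

best_rouge_l, stale_epochs = 0.0, 0
for epoch in range(MAX_EPOCHS):
    train_one_epoch(model, optimizer)
    rouge_l = evaluate_rouge_l(model)
    if rouge_l > best_rouge_l:
        best_rouge_l, stale_epochs = rouge_l, 0
    else:
        stale_epochs += 1
        if stale_epochs >= PATIENCE:  # no improvement for 20 successive epochs
            break
```

The early-stopping criterion mirrors the paper's description exactly (stop when validation ROUGE-L has not increased for 20 successive epochs); everything around it is scaffolding.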