Commonsense Knowledge Base Completion with Structural and Semantic Context
Authors: Chaitanya Malaviya, Chandra Bhagavatula, Antoine Bosselut, Yejin Choi
AAAI 2020, pp. 2925-2933 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide the first empirical results for KB completion on ATOMIC and evaluation with ranking metrics on ConceptNet. Our results demonstrate the effectiveness of language model representations in boosting link prediction performance and the advantages of learning from local graph structure (+1.5 points in MRR for ConceptNet) when training on subgraphs for computational efficiency. Further analysis on model predictions shines light on the types of commonsense knowledge that language models capture well. (An MRR sketch follows this table.) |
| Researcher Affiliation | Collaboration | Allen Institute for Artificial Intelligence University of Washington {chaitanyam, chandrab}@allenai.org, {antoineb, yejin}@cs.washington.edu |
| Pseudocode | No | The paper describes the model architecture and equations but does not provide structured pseudocode or an algorithm block. |
| Open Source Code | Yes | Code and dataset are available at github.com/allenai/commonsense-kg-completion. |
| Open Datasets | Yes | We focus our experiments on two prominent knowledge graphs: ConceptNet and ATOMIC. Statistics for both graphs are provided in Table 1, along with FB15K-237, a standard KB completion dataset. ... https://ttic.uchicago.edu/~kgimpel/commonsense.html and https://homes.cs.washington.edu/~msap/atomic/ |
| Dataset Splits | Yes | We used the original splits from the dataset, and combined the two provided development sets to create a larger development set. The development and test sets consisted of 1200 tuples each. ... The original dataset split was created to make the set of seed entities between the training and evaluation splits mutually exclusive. Since the KB completion task requires entities to be seen at least once, we create a new random 80-10-10 split for the dataset. The development and test sets consisted of 87K tuples each. (A minimal split sketch follows this table.) |
| Hardware Specification | Yes | For instance, the model with GCN and BERT representations for ATOMIC occupies 30GB memory and takes 8-10 days for training on a Quadro RTX 8000 GPU. |
| Software Dependencies | No | The paper mentions 'Deep Graph Library (DGL)' for implementation and finetuning 'BERT', but does not provide specific version numbers for these software components. |
| Experiment Setup | Yes | BERT Fine-tuning: We used a maximum sequence length of 64, batch size of 32, and learning rate of 3e-5 to fine-tune the uncased BERT-Large model with the masked language modeling objective. The warmup proportion was set to 0.1. ... The graph convolutional network used 2 layers... an input and output embedding dimension of 200. The graph batch size used for subgraph sampling was 30000 edges. For the ConvTransE decoder, we used 500 channels, a kernel size of 5 and a batch size of 128. Dropout was enforced at the feature map layers, the input layer and after the fully connected layer in the decoder, with a value of 0.2. The Adam optimizer was used for optimization with a learning rate of 1e-4 and gradient clipping was performed with a max gradient norm value of 1.0. We performed L2 weight regularization with a weight of 0.1. We also used label smoothing with a value of 0.1. (These hyperparameters are gathered into a config sketch after this table.) |
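The ranking metric referenced in the Research Type excerpt is mean reciprocal rank (MRR). Below is a minimal sketch of how MRR is computed; the input format (1-based ranks of each gold entity among all scored candidates) is an assumption for illustration and is not taken from the authors' evaluation code.

```python
# Minimal MRR sketch: `ranks` holds the 1-based rank of each gold tail
# entity among all candidate entities scored by the model
# (an assumed input format, not the paper's evaluation pipeline).
def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all evaluation triples."""
    return sum(1.0 / r for r in ranks) / len(ranks)

# Example: gold entities ranked 1st, 4th, and 10th by the model.
print(mean_reciprocal_rank([1, 4, 10]))  # 0.45
```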
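For the ATOMIC split described under Dataset Splits, the following is a minimal sketch of a random 80-10-10 partition over triples. The paper only states that the split is random with 80/10/10 proportions; the function name, seed, and shuffling procedure are assumptions.

```python
import random

# Hypothetical helper: randomly partition (head, relation, tail) triples
# into 80% train / 10% dev / 10% test, as described for ATOMIC.
def random_split(triples, train_frac=0.8, dev_frac=0.1, seed=0):
    items = list(triples)
    random.Random(seed).shuffle(items)  # fixed seed is an assumption
    n = len(items)
    n_train = int(train_frac * n)
    n_dev = int(dev_frac * n)
    return (items[:n_train],
            items[n_train:n_train + n_dev],
            items[n_train + n_dev:])
```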
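The hyperparameters quoted under Experiment Setup are gathered below into a single Python dict for readability. The key names and grouping are illustrative assumptions; the released repository may expose these options under different names or flags.

```python
# Hyperparameters as reported in the paper, collected in one place.
# Key names are assumptions made for this sketch.
CONFIG = {
    # BERT fine-tuning (uncased BERT-Large, masked LM objective)
    "bert_max_seq_length": 64,
    "bert_batch_size": 32,
    "bert_learning_rate": 3e-5,
    "bert_warmup_proportion": 0.1,
    # Graph convolutional encoder
    "gcn_num_layers": 2,
    "embedding_dim": 200,        # input and output embedding dimension
    "graph_batch_size": 30000,   # edges per sampled subgraph
    # ConvTransE decoder
    "decoder_channels": 500,
    "decoder_kernel_size": 5,
    "decoder_batch_size": 128,
    "dropout": 0.2,              # feature maps, input layer, and FC layer
    # Optimization
    "optimizer": "adam",
    "learning_rate": 1e-4,
    "max_grad_norm": 1.0,
    "l2_weight": 0.1,
    "label_smoothing": 0.1,
}
```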