Few-shot Relation Extraction via Bayesian Meta-learning on Relation Graphs

Authors: Meng Qu, Tianyu Gao, Louis-Pascal Xhonneux, Jian Tang

ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct extensive experiments to evaluate the proposed approach on two benchmark datasets of few-shot relation extraction. Empirical results prove the effectiveness of our proposed approach over many competitive baselines in both the settings of few-shot and zero-shot relation extraction.
Researcher Affiliation | Academia | 1 Mila - Québec AI Institute, Montréal, Canada; 2 University of Montréal, Montréal, Canada; 3 Tsinghua University, Beijing, China; 4 HEC Montréal, Montréal, Canada; 5 CIFAR AI Research Chair. Correspondence to: Meng Qu <meng.qu@umontreal.ca>, Jian Tang <jian.tang@hec.ca>.
Pseudocode | Yes | Algorithm 1: Training Algorithm
Open Source Code | No | The paper does not include an unambiguous statement or link indicating the release of its own source code for the described methodology. The provided links are for datasets.
Open Datasets | Yes | We use two benchmark datasets for evaluation. One dataset is the FewRel dataset (Han et al., 2018; Gao et al., 2019)... The other dataset is NYT-25. The raw data of NYT-25 is from the official website of FewRel (Han et al., 2018; Gao et al., 2019) (https://github.com/thunlp/FewRel)...
Dataset Splits | Yes | For both datasets, the relations are from a knowledge graph named Wikidata... We randomly sample 10 relations for training, 5 for validation and the remaining 10 for test.
Hardware Specification | No | The paper does not provide specific hardware details (like exact GPU/CPU models or processor types) used for running its experiments.
Software Dependencies | No | The paper mentions software components like BERT-Base, SGD, and GraphVite, but it does not specify version numbers for these or other key software dependencies (e.g., Python, PyTorch, TensorFlow, CUDA) that would be needed for reproducibility.
Experiment Setup | Yes | In our approach, we use BERT-Base (Devlin et al., 2019) as encoder to encode all the tokens in a sentence. For the softmax function of the likelihood on support and query sentences, we apply an annealing temperature of 10. For the Gaussian prior of prototype vectors, we apply a one-layer graph convolutional network (Kipf & Welling, 2017) to the global relation graph to compute the mean. For the stochastic gradient Langevin dynamics, the number of samples to draw is set as 10 by default, which is the same as used by other Bayesian meta-learning methods, and we perform 5 steps of update for these samples with the initial step size (i.e., ϵ in Eq. (9)) as 0.1 by default. The graph encoder and sentence encoder are tuned by SGD with learning rate as 0.1.
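
As a rough reproduction aid, the following is a minimal sketch of the stochastic gradient Langevin dynamics step described in the Experiment Setup row: 10 parallel samples of the relation prototype vectors, 5 update steps, and an initial step size of 0.1, initialized at the prior mean produced by the graph encoder. The function name `sgld_sample_prototypes` and the `log_joint` callable are illustrative assumptions; since no code release accompanies the paper, this is not the authors' implementation.

```python
import torch

def sgld_sample_prototypes(prior_mean, log_joint,
                           n_samples=10, n_steps=5, step_size=0.1):
    """Draw prototype samples with stochastic gradient Langevin dynamics.

    prior_mean : [n_relations, dim] tensor, e.g. the mean produced by the
                 one-layer GCN over the global relation graph.
    log_joint  : hypothetical callable mapping [n_samples, n_relations, dim]
                 prototypes to a per-sample unnormalized log posterior
                 (prior term plus support-set likelihood).
    """
    # Initialize every sample at the prior mean.
    protos = prior_mean.detach().unsqueeze(0).repeat(n_samples, 1, 1).requires_grad_(True)
    for _ in range(n_steps):
        logp = log_joint(protos).sum()              # sum over the parallel samples
        (grad,) = torch.autograd.grad(logp, protos)
        with torch.no_grad():
            noise = torch.randn_like(protos)
            # Langevin update: gradient ascent on log p plus injected Gaussian noise.
            # A decaying step-size schedule (0.1 is only the *initial* size) is omitted.
            protos = protos + 0.5 * step_size * grad + (step_size ** 0.5) * noise
        protos.requires_grad_(True)
    return protos.detach()
```

The sampled prototypes would then be used to score query sentences (with the temperature-10 softmax mentioned above); that part is omitted from the sketch.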
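
Likewise, the Dataset Splits row describes a relation-level split for NYT-25 of 10 training, 5 validation, and 10 test relations. A tiny sketch of such a split, with illustrative relation names and seed that are not taken from the paper:

```python
import random

# Only the 10/5/10 split sizes come from the quoted description;
# the relation names and the seed are illustrative placeholders.
relations = [f"relation_{i}" for i in range(25)]
rng = random.Random(42)
rng.shuffle(relations)
train_rels = relations[:10]
val_rels = relations[10:15]
test_rels = relations[15:]  # the remaining 10 relations
```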