Task-Specific Representation Learning for Web-Scale Entity Disambiguation

Authors: Rijula Kar, Susmija Reddy, Sourangshu Bhattacharya, Anirban Dasgupta, Soumen Chakrabarti

AAAI 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We report extensively on the accuracy of TSRL for the NED task over the standard CoNLL, MSNBC, AQUAINT and ACE data sets (Hoffart et al. 2011; Guo and Barbosa 2016). In terms of both micro- and macro-averaged accuracy, TSRL surpasses standard MTL and MTRL approaches, as well as the best feature-engineered baselines, in most cases.
Researcher Affiliation | Academia | Rijula Kar, Susmija Reddy, Sourangshu Bhattacharya (IIT Kharagpur, India); Anirban Dasgupta (IIT Gandhinagar, India); Soumen Chakrabarti (IIT Bombay, India). rijula.cse@iitkgp.ac.in, {jsreddy,sourangshu}@cse.iitkgp.ernet.in, anirbandg@iitgn.ac.in, soumen@cse.iitb.ac.in
Pseudocode | No | The paper describes methods and models in prose but does not include structured pseudocode or algorithm blocks.
Open Source Code | Yes | Our code is available (footnote 1: https://github.com/rijula/tsrl-aaai18 and https://goo.gl/vw6C6g).
Open Datasets | Yes | We report extensively on the accuracy of TSRL for the NED task over the standard CoNLL, MSNBC, AQUAINT and ACE data sets (Hoffart et al. 2011; Guo and Barbosa 2016). We used the alias-entity mapping indexes created by Ganea and Hofmann (2017). The training corpus was collected from the November 2016 Wikipedia dump. (See the index-loading sketch after this table.)
Dataset Splits | No | The paper mentions training on various datasets and evaluating on the 'testb' test fold, but does not explicitly describe a separate validation split (percentages or sample counts), which is needed to fully reproduce the data partitioning.
Hardware Specification | Yes | All experiments were implemented in Theano 0.8.2 and run on a few Xeon servers with 32 cores and 96 GB RAM each.
Software Dependencies | Yes | All experiments were implemented in Theano 0.8.2.
Experiment Setup | Yes | For optimization, we used SGD with minibatches of 1000 mention instances and learning rate 1/k, where k is the epoch number. Label predictions were made after averaging the model weights over the last 30 iterations, to remove noise. L2 regularizers were logarithmically grid-searched between 10^-6 and 10^6, reporting the best accuracy achieved on the test dataset. (See the training-loop sketch after this table.)
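The alias-entity indexes quoted in the Open Datasets row are the candidate-generation resource for NED. As a minimal loading sketch, the snippet below assumes a hypothetical tab-separated layout (alias, candidate entity, prior score) and the hypothetical file name alias_entity_priors.tsv; the actual format of the indexes released by Ganea and Hofmann (2017) may differ.

```python
from collections import defaultdict

def load_alias_index(path):
    """Load a hypothetical alias -> [(entity, prior)] candidate index.

    Assumes one tab-separated record per line: alias, entity, prior.
    The real index files from Ganea and Hofmann (2017) may use a
    different layout; this only illustrates the data structure.
    """
    index = defaultdict(list)
    with open(path, encoding="utf-8") as f:
        for line in f:
            alias, entity, prior = line.rstrip("\n").split("\t")
            index[alias].append((entity, float(prior)))
    for alias in index:  # rank candidates by descending prior p(e|m)
        index[alias].sort(key=lambda pair: -pair[1])
    return index

# Usage: candidate entities for the mention string "Paris"
# candidates = load_alias_index("alias_entity_priors.tsv")["Paris"]
```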
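To make the quoted optimization recipe concrete, here is a minimal NumPy sketch. The paper's actual implementation is in Theano 0.8.2 and trains TSRL embeddings; the least-squares loss, random data, and evaluation below are toy stand-ins. Only the 1/k learning-rate schedule, the averaging of weights over the last 30 iterations, and the logarithmic L2 grid follow the quoted description.

```python
import numpy as np

def sgd_with_tail_averaging(grad_fn, w0, n_epochs, batches_per_epoch, l2, tail=30):
    """SGD with learning rate 1/k at epoch k, returning the average of the
    weights over the last `tail` iterations to remove noise, as quoted.
    grad_fn(w, b) stands in for the TSRL loss gradient on minibatch b."""
    w = w0.astype(float).copy()
    snapshots = []
    for k in range(1, n_epochs + 1):
        lr = 1.0 / k  # learning rate 1/k, k = epoch number
        for b in range(batches_per_epoch):
            g = grad_fn(w, b) + l2 * w  # loss gradient plus L2 term
            w = w - lr * g
            snapshots.append(w.copy())
            snapshots = snapshots[-tail:]  # keep the last 30 iterates
    return np.mean(snapshots, axis=0)

# Toy stand-in for the task loss: least squares on random data.
rng = np.random.default_rng(0)
X, y = rng.normal(size=(1000, 5)), rng.normal(size=1000)

def grad_fn(w, b):
    """Gradient of 0.5 * ||X_b w - y_b||^2 / 100 on a toy 'minibatch'
    (the paper uses minibatches of 1000 mention instances)."""
    i = slice(b * 100, (b + 1) * 100)
    return X[i].T @ (X[i] @ w - y[i]) / 100

def mse(w):  # toy evaluation; the paper selects by test-set accuracy
    return float(np.mean((X @ w - y) ** 2))

# Logarithmic grid search over the L2 coefficient, 1e-6 .. 1e6.
results = {l2: mse(sgd_with_tail_averaging(grad_fn, np.zeros(5), 5, 10, l2))
           for l2 in (10.0 ** p for p in range(-6, 7))}
best_l2 = min(results, key=results.get)
```

Averaging the last iterates (a Polyak-style tail average) smooths the fluctuation of SGD near its limit, which matches the paper's stated motivation of removing noise before making label predictions.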