Labeled Memory Networks for Online Model Adaptation

Authors: Shiv Shankar, Sunita Sarawagi

AAAI 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate online model adaptation strategies on five sequence prediction tasks, an image classification task, and two language modeling tasks. We show that LMNs are better than other MANNs designed for meta-learning. We also found them to be more accurate and faster than state-of-the-art methods of retuning model parameters for adapting to domain-specific labeled data.
Researcher Affiliation | Academia | Shiv Shankar (shiv shankar@iitb.ac.in), IIT Bombay; Sunita Sarawagi (sunita@iitb.ac.in), IIT Bombay
Pseudocode | Yes | The overall algorithm is depicted in Figure 2.
Open Source Code | Yes | Code to be available at https://github.com/sshivs/LMN
Open Datasets | Yes | FSQNYC and FSQTokyo are Location Based Social Network data collected by (Yang et al. 2015) from Foursquare, recording user check-ins at various venues over a year. Brightkite (Cho, Myers, and Leskovec 2011) is a user check-in dataset made available as part of the Stanford Network Analysis Project (Leskovec and Krevl 2014). Geolife (Zheng et al. 2009) is the trajectory data of people collected over multiple days... The Yoochoose dataset (Ben-Shimon et al. 2015) is the click event sessions... We use the popular Omniglot dataset (Lake, Salakhutdinov, and Tenenbaum 2015). We compared on the common language datasets Wikitext2 and Text8 with memory sizes 100 and 2000, as used in previously published work.
Dataset Splits | Yes | In Table 1 we summarize the average length of each sequence, the number of tokens, and the number of sequences in the training and test sets.
Hardware Specification | Yes | We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan X Pascal GPU used for this research.
Software Dependencies | No | The paper mentions software components like 'Adam optimizer' and 'GRU' but does not provide specific version numbers for these or any other libraries or frameworks used.
Experiment Setup | Yes | In all experiments we used the Adam optimizer (Kingma and Ba 2014). The PCN is a GRU, and its input is the embedding of the true observed token y_{t-1} from the previous time step. In our experiments we used a decay value of 0.99. The margin is a hyper-parameter.
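A minimal sketch of the experiment setup described above: a GRU-based prediction class network (PCN) that consumes the embedding of the previous true token y_{t-1} and is trained with Adam. PyTorch, the layer sizes, and the class/variable names (PCN, vocab_size, embed_dim, hidden_dim) are assumptions for illustration; the paper does not specify a framework, and the labeled-memory component, the 0.99 decay, and the margin hyper-parameter are not modeled here.

```python
# Hypothetical sketch of the paper's PCN setup (GRU over previous-token
# embeddings, trained with Adam); not the authors' released implementation.
import torch
import torch.nn as nn


class PCN(nn.Module):
    def __init__(self, vocab_size, embed_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.gru = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, prev_tokens, hidden=None):
        # prev_tokens: (batch, seq_len) holding the true tokens y_{t-1}
        emb = self.embed(prev_tokens)
        states, hidden = self.gru(emb, hidden)
        return self.out(states), hidden


model = PCN(vocab_size=10_000)  # vocab size is an illustrative placeholder
optimizer = torch.optim.Adam(model.parameters())  # Adam, as stated in the paper
loss_fn = nn.CrossEntropyLoss()

# One illustrative training step on dummy data.
prev_tokens = torch.randint(0, 10_000, (8, 20))
targets = torch.randint(0, 10_000, (8, 20))
logits, _ = model(prev_tokens)
loss = loss_fn(logits.reshape(-1, logits.size(-1)), targets.reshape(-1))
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

The decay value (0.99) and the margin mentioned in the paper belong to the labeled-memory component, which the sketch omits.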