Anchor & Transform: Learning Sparse Embeddings for Large Vocabularies

Authors: Paul Pu Liang, Manzil Zaheer, Yuan Wang, Amr Ahmed

ICLR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | On text classification, language modeling, and movie recommendation benchmarks, we show that ANT is particularly suitable for large vocabulary sizes and demonstrates stronger performance with fewer parameters (up to 40x compression) as compared to existing compression baselines.
Researcher Affiliation | Collaboration | Google Research, Carnegie Mellon University; pliang@cs.cmu.edu, {manzilzaheer,yuanwang,amra}@google.com
Pseudocode | Yes | Algorithm 1 (ANCHOR & TRANSFORM algorithm for learning sparse representations of discrete objects) and Algorithm 2 (NBANT: Nonparametric Bayesian ANT); see the first sketch after the table.
Open Source Code | Yes | Code for our experiments can be found at https://github.com/pliang279/sparse_discrete.
Open Datasets | Yes | AG-News (V = 62K) (Zhang et al., 2015), DBPedia (V = 563K) (Lehmann et al., 2015), Sogou-News (V = 254K) (Zhang et al., 2015), and Yelp-review (V = 253K) (Zhang et al., 2015)... Penn Treebank (PTB) (V = 10K) (Marcus et al., 1993) and WikiText-103 (V = 267K) (Merity et al., 2017)... MovieLens 25M (Harper & Konstan, 2015)... Amazon Product reviews (Ni et al., 2019).
Dataset Splits | Yes | To perform optimization over the number of anchors, our algorithm starts with a small A = 10 and either adds anchors (i.e., adding a new row to A and a new column to T) or deletes anchors to minimize eq (5) at every epoch, depending on the trend of the objective evaluated on the validation set (see the second sketch after the table).
Hardware Specification | Yes | For each epoch on MovieLens 25M, standard MF takes 165s on a GTX 980 Ti GPU, while ANT takes 176s for A = 5 and 180s for A = 20.
Software Dependencies | No | The paper mentions using TensorFlow, PyTorch, NLTK, and the YOGI optimizer, but does not provide specific version numbers for these software components.
Experiment Setup | Yes | Here we provide more details for our experiments including hyperparameters used, design decisions, and comparison with baseline methods. We also include the anonymized code in the supplementary material. Hyperparameters are listed in tables such as Table 6: Table of hyperparameters for text classification experiments on AG-News, DBPedia, Sogou-News, and Yelp-review datasets.
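To make the factorization behind Algorithm 1 concrete, here is a minimal PyTorch sketch, not the authors' released implementation. It assumes the embedding table is formed as E = T A, where A holds a small set of anchor embeddings and T is a sparse, nonnegative transform; the class name, its methods, and the proximal step below are assumptions modeled on the paper's description rather than code taken from the repository.

```python
import torch
import torch.nn as nn


class AnchorTransformEmbedding(nn.Module):
    """Sketch of an ANT-style embedding: E = T @ A with sparse, nonnegative T."""

    def __init__(self, vocab_size: int, num_anchors: int, dim: int):
        super().__init__()
        # A: a small matrix of anchor embeddings (num_anchors << vocab_size).
        self.anchors = nn.Parameter(0.01 * torch.randn(num_anchors, dim))
        # T: per-token nonnegative mixing weights over the anchors.
        self.transform = nn.Parameter(0.01 * torch.rand(vocab_size, num_anchors))

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        # Look up the rows of T for the given tokens and mix the anchors.
        return self.transform[token_ids] @ self.anchors

    @torch.no_grad()
    def proximal_step(self, lam: float, lr: float) -> None:
        # Soft-threshold and clamp T after each gradient step so it stays
        # sparse and nonnegative (the source of the parameter savings).
        self.transform.copy_(torch.clamp(self.transform - lam * lr, min=0.0))
```

In use, one would call proximal_step(lam, lr) after every optimizer step; only the nonzero entries of T plus the small anchor matrix A then need to be stored, which is where the reported compression of the embedding table comes from.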
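The Dataset Splits row quotes the NBANT (Algorithm 2) schedule for growing or shrinking the anchor set. The function below is only a schematic reading of that sentence under stated assumptions: the training step, the validation objective (eq (5)), and the anchor add/delete operations are passed in as hypothetical hooks, not functions from the authors' codebase.

```python
from typing import Callable


def nbant_anchor_schedule(
    train_one_epoch: Callable[[], None],
    validation_objective: Callable[[], float],
    add_anchor: Callable[[], None],     # adds a new row to A and a new column to T
    delete_anchor: Callable[[], None],  # removes a row of A and a column of T
    max_epochs: int = 50,
    init_anchors: int = 10,
) -> int:
    """Grow or shrink the anchor set each epoch based on the validation trend."""
    num_anchors = init_anchors
    best_obj = float("inf")
    grow = True  # start by trying to add anchors
    for _ in range(max_epochs):
        train_one_epoch()
        obj = validation_objective()  # objective of eq (5) on the validation set
        if obj >= best_obj:
            grow = not grow  # trend stopped improving: reverse direction
        best_obj = min(best_obj, obj)
        if grow:
            add_anchor()
            num_anchors += 1
        elif num_anchors > 1:
            delete_anchor()
            num_anchors -= 1
    return num_anchors
```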