GLoMo: Unsupervised Learning of Transferable Relational Graphs
Authors: Zhilin Yang, Jake Zhao, Bhuwan Dhingra, Kaiming He, William W. Cohen, Russ R. Salakhutdinov, Yann LeCun
NeurIPS 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results show that GLoMo improves performance on various language tasks including question answering, natural language inference, and sentiment analysis. We also demonstrate that the learned graphs are generic enough to work with various sets of features on which the graphs have not been trained, including GloVe embeddings, ELMo embeddings, and task-specific RNN states. We also identify key factors of learning successful generic graphs: decoupling graphs and features, hierarchical graph representations, sparsity, unit-level objectives, and sequence prediction. To demonstrate the generality of our framework, we further show improved results on image classification by applying GLoMo to model the relational dependencies between the pixels. |
| Researcher Affiliation | Collaboration | 1Carnegie Mellon University, 2New York University, 3Facebook AI Research |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code or links to a code repository for the described methodology. |
| Open Datasets | Yes | Question Answering: The Stanford question answering dataset [31] (SQuAD) was recently proposed to advance machine reading comprehension. Natural Language Inference: We chose to use the latest Multi-Genre NLI corpus (MNLI) [46]. Sentiment Analysis: We use the movie review dataset collected in [22], with 25,000 training and 25,000 testing samples crawled from IMDB. Image Classification: We leverage the entire ImageNet [11] dataset and have the images resized to 32x32 [27]. In the transfer phase, we chose CIFAR-10 classification as our target task. |
| Dataset Splits | Yes | Table 3: CIFAR-10 classification results. We adopt a 42,000/8,000 train/validation split; once the best model is selected according to the validation error, we directly forward it to the test set without doing any validation-set place-back retraining. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware used for its experiments (e.g., CPU/GPU models, memory specifications). |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | Here D is a hyper-parameter called the context length. In our implementation, at position t, in addition to predicting the forward context (x_{t+1}, ..., x_{t+D}), we also use a separate network to predict the backward context (x_{t-D}, ..., x_{t-1}), similar to [30]. We also adopt the multi-head attention [42] to produce multiple graphs per layer. We only used horizontal flipping for data augmentation. |
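The setup row above mentions that multi-head attention is used to produce multiple relational graphs per layer, with sparsity identified as a key factor. A minimal numpy sketch of this idea follows; the function and parameter names (`multi_head_graphs`, `Wk`, `Wq`) are hypothetical, and the squared-ReLU affinity followed by row normalization is an assumption about the sparse-attention form rather than a verified reproduction of the paper's exact equations.

```python
import numpy as np

def multi_head_graphs(x, Wk, Wq, bias=0.0, eps=1e-8):
    """Produce one sparse, row-normalized T x T affinity graph per head.

    x:  (T, d) token features.
    Wk: (H, d, d_head) per-head key projections (hypothetical shape).
    Wq: (H, d, d_head) per-head query projections.
    A squared ReLU of key-query products zeroes out many edges
    (sparsity); each row is then normalized to sum to at most 1.
    """
    H, T = Wk.shape[0], x.shape[0]
    graphs = np.zeros((H, T, T))
    for h in range(H):
        k = x @ Wk[h]                                   # (T, d_head) keys
        q = x @ Wq[h]                                   # (T, d_head) queries
        scores = np.maximum(k @ q.T + bias, 0.0) ** 2   # sparse affinities
        graphs[h] = scores / (scores.sum(axis=1, keepdims=True) + eps)
    return graphs
```

Because the graphs are decoupled from any particular feature set, the same (H, T, T) tensor could in principle be used to mix GloVe vectors, ELMo embeddings, or task-specific RNN states by matrix-multiplying the graph with whatever (T, d') feature matrix the downstream task provides.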