The Goldilocks Principle: Reading Children's Books with Explicit Memory Representations
Authors: Felix Hill, Antoine Bordes, Sumit Chopra, Jason Weston
ICLR 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We compare a range of state-of-the-art models, each with a different way of encoding what has been previously read. We show that models which store explicit representations of long-term contexts outperform state-of-the-art neural language models at predicting semantic content words, although this advantage is not observed for syntactic function words. Interestingly, we find that the amount of text encoded in a single memory representation is highly influential to the performance: there is a sweet-spot, not too big and not too small, between single words and full sentences that allows the most meaningful information in a text to be effectively retained and recalled. Further, the attention over such window-based memories can be trained effectively through self-supervision. We then assess the generality of this principle by applying it to the CNN QA benchmark, which involves identifying named entities in paraphrased summaries of news articles, and achieve state-of-the-art performance. |
| Researcher Affiliation | Collaboration | Felix Hill, Antoine Bordes, Sumit Chopra & Jason Weston, Facebook AI Research, 770 Broadway, New York, USA. felix.hill@cl.cam.ac.uk, {abordes,spchopra,jase}@fb.com |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | For the lexical memory we use the code available at https://github.com/facebook/MemNN. (This refers to code for a variant used, not the paper's specific methodology.) |
| Open Datasets | Yes | The CBT is built from books that are freely available thanks to Project Gutenberg. ... The dataset can be downloaded from http://fb.ai/babi/. |
| Dataset Splits | Yes | Training / Validation / Test — Number of books: 98 / 5 / 5; Number of questions (context+query): 669,343 / 8,000 / 10,000; Average words in contexts: 465 / 435 / 445; Average words in queries: 31 / 27 / 29; Distinct candidates: 37,242 / 5,485 / 7,108; Vocabulary size: 53,628 |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | All models were implemented using the Torch library (see torch.ch). ... We trained an n-gram language model using the KenLM toolkit (Heafield et al., 2013). ... (based on output from the POS tagger and named-entity-recogniser in the Stanford CoreNLP Toolkit (Manning et al., 2014)). (No version numbers are provided for these software components.) |
| Experiment Setup | Yes | Optimal hyper-parameter values on CBT: Embedding model (context+query): p = 300, λ = 0.01. ... MemNNs (window memory + self-sup.): n = all, b = 5, λ = 0.01, p = 300. |
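
To make the window-memory idea quoted in the Research Type row more concrete, the following minimal Python sketch builds fixed-size word windows centred on each occurrence of a candidate answer in the context. This is not the authors' released Torch code; the function and variable names are hypothetical, and only the window size b = 5 comes from the quoted experiment setup.

```python
def build_window_memories(context_tokens, candidates, b=5):
    """Collect one memory per candidate occurrence: the window of b tokens
    centred on that occurrence (truncated at context boundaries).
    Hypothetical re-implementation for illustration only."""
    half = b // 2
    memories = []
    for i, tok in enumerate(context_tokens):
        if tok in candidates:
            window = context_tokens[max(0, i - half): i + half + 1]
            memories.append((tok, window))
    return memories

# Toy usage: windows of size b = 5 around each candidate occurrence.
ctx = "the wolf crept quietly toward the sleeping pig in the barn".split()
print(build_window_memories(ctx, candidates={"wolf", "pig", "barn"}))
```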
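Similarly, the optimal hyper-parameter values quoted in the Experiment Setup row could be collected into a single configuration object for a re-run attempt. The sketch below is purely illustrative: the key names are hypothetical, and only the symbols and values (p, λ, n, b) are taken from the paper.

```python
# Illustrative configuration of the optimal CBT hyper-parameters quoted above.
# Key names are hypothetical; symbols and values (p, lambda, n, b) are as reported.
CBT_OPTIMAL_HPARAMS = {
    "embedding_model_context_query": {"p": 300, "lambda": 0.01},
    "memnn_window_memory_self_supervised": {"n": "all", "b": 5, "lambda": 0.01, "p": 300},
}
```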