Leap-Of-Thought: Teaching Pre-Trained Models to Systematically Reason Over Implicit Knowledge
Authors: Alon Talmor, Oyvind Tafjord, Peter Clark, Yoav Goldberg, Jonathan Berant
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We train our models by automatically generating examples that illustrate the expected types of inference. and We evaluate our model in three different setups: and Table 1: Test set results for reasoning over hypernymy and meronymy relations. The models learn to reason with implicit rules, significantly improving on the hypothesis-only baseline, some in zero-shot. |
| Researcher Affiliation | Collaboration | Alon Talmor1,2 Oyvind Tafjord1 Peter Clark1 Yoav Goldberg1,3 Jonathan Berant1,2 1The Allen Institute for AI 2Tel-Aviv University, 3Bar-Ilan University |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | All our code and data is publicly available at http://github.com/alontalmor/LeapOfThought. |
| Open Datasets | Yes | Our primary motivation is to develop models that work in an open domain environment with real world facts. Thus, we automatically generate data by sampling from existing knowledge sources: CONCEPTNET [12], WORDNET [13] and WIKIDATA [14]. and fine-tune ROBERTA [8], on binary (yes/no) question answering tasks from two datasets (using standard multi-task training): (a) 50K examples from TWENTY QUESTIONS (20Q; https://github.com/allenai/twentyquestions), a question answering (QA) dataset which includes questions such as Does an aircraft fly? (true) and Do everyone have an alarm? (false). This teaches the model to retrieve real world facts from its internal implicit knowledge; and (b) 100K examples from the RULETAKER [4] reasoning dataset, teaching the model to reason over a set of assertions explicitly provided as natural language statements. |
| Dataset Splits | Yes | We generate 30,906 training examples using this procedure. We create development and test sets, 1,289 examples each, where the subjects and objects are disjoint from the training set. and Overall 38,700/3,005/3,005 training/development/test examples were created. (An illustrative entity-disjoint split sketch follows this table.) |
| Hardware Specification | No | The paper does not provide specific hardware details (like GPU or CPU models, or cloud instance types) used for running the experiments. |
| Software Dependencies | No | The paper mentions models like ROBERTA and ESIM, but does not provide specific version numbers for any software dependencies or libraries used for the experiments. |
| Experiment Setup | No | The paper describes the input format and loss function used ('binary cross-entropy loss'), but it does not provide specific experimental setup details such as hyperparameter values (e.g., learning rate, batch size, number of epochs) or optimizer settings. (A hedged fine-tuning sketch with assumed hyperparameters follows this table.) |
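
The Dataset Splits row reports development and test sets whose subjects and objects are disjoint from training. The paper does not print the splitting code, so the sketch below only illustrates that stated constraint. The function name `split_disjoint`, the `(subject, relation, object)` triple format, and the held-out fraction are assumptions for illustration, not the authors' implementation (their generation code is in the repository linked above).

```python
import random

def split_disjoint(triples, held_out_frac=0.08, seed=0):
    """Split (subject, relation, object) triples so that no dev/test
    subject or object appears anywhere in the training triples."""
    rng = random.Random(seed)
    entities = sorted({e for s, _, o in triples for e in (s, o)})
    rng.shuffle(entities)
    held_out = set(entities[: int(len(entities) * held_out_frac)])

    train, rest = [], []
    for s, r, o in triples:
        (rest if s in held_out or o in held_out else train).append((s, r, o))

    # Drop held-out triples that still touch a training entity, so the
    # evaluation sets are fully entity-disjoint from training.
    train_entities = {e for s, _, o in train for e in (s, o)}
    eval_pool = [t for t in rest if not ({t[0], t[2]} & train_entities)]
    rng.shuffle(eval_pool)
    mid = len(eval_pool) // 2
    return train, eval_pool[:mid], eval_pool[mid:]  # train / dev / test

# Example usage with toy hypernymy triples:
triples = [("whale", "IsA", "mammal"), ("sparrow", "IsA", "bird"),
           ("oak", "IsA", "tree"), ("salmon", "IsA", "fish")]
train, dev, test = split_disjoint(triples, held_out_frac=0.5)
```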
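The Experiment Setup row notes that the paper names a binary cross-entropy objective over yes/no questions but reports no hyperparameters. The following is a minimal sketch of such fine-tuning using the Hugging Face `transformers` API. The choice of `roberta-large`, the learning rate, batch size, epoch count, and the toy two-example dataset are all assumptions, not values from the paper; the authors' actual training code is in their repository.

```python
import torch
from torch.utils.data import DataLoader
from transformers import RobertaTokenizerFast, RobertaForSequenceClassification

# Assumed hyperparameters; the paper does not report these.
MODEL_NAME = "roberta-large"
LR, BATCH_SIZE, EPOCHS = 1e-5, 8, 3

tokenizer = RobertaTokenizerFast.from_pretrained(MODEL_NAME)
model = RobertaForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)

# Toy yes/no mix in the spirit of 20Q + RuleTaker: question text paired
# with a binary label (1 = true, 0 = false). Second item is verbatim 20Q.
examples = [
    ("Does an aircraft fly?", 1),
    ("Do everyone have an alarm?", 0),
]

def collate(batch):
    texts, labels = zip(*batch)
    enc = tokenizer(list(texts), padding=True, truncation=True,
                    return_tensors="pt")
    enc["labels"] = torch.tensor(labels)
    return enc

loader = DataLoader(examples, batch_size=BATCH_SIZE, shuffle=True,
                    collate_fn=collate)
optimizer = torch.optim.AdamW(model.parameters(), lr=LR)

model.train()
for _ in range(EPOCHS):
    for batch in loader:
        optimizer.zero_grad()
        # With num_labels=2 the model applies softmax cross-entropy over
        # the two classes, equivalent to the binary cross-entropy loss
        # the paper names.
        loss = model(**batch).loss
        loss.backward()
        optimizer.step()
```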