Learning Invariants through Soft Unification

Authors: Nuri Cingillioglu, Alessandra Russo

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We evaluate our approach on five datasets to demonstrate that learning invariants captures patterns in the data and can improve performance over baselines." and "We probe three aspects of soft unification: the impact of unification on performance over unseen data, the effect of multiple invariants and data efficiency. To that end, we train UMLP, UCNN and URNN with and without unification and UMN with pre-training using 1 or 3 invariants over either the entire training set or only 50 examples."
Researcher Affiliation | Academia | Nuri Cingillioglu, Imperial College London, nuric@imperial.ac.uk; Alessandra Russo, Imperial College London, a.russo@imperial.ac.uk
Pseudocode | Yes | Algorithm 1: Unification Networks
Open Source Code | Yes | "Our implementation using Chainer [45] is publicly available at https://github.com/nuric/softuni with the accompanying data."
Open Datasets | Yes | "We use five datasets consisting of context, query and an answer (C, Q, a) (see Table 1 and Appendix B for further details) with varying input structures: fixed or varying length sequences, grids and nested sequences (e.g. stories)." and "The bAbI dataset consists of 20 synthetically generated natural language reasoning tasks (refer to [48] for task details)." and "To evaluate on a noisy real-world dataset, we take the sentiment analysis task from [41] and prune sentences to a maximum length of 20 words."
Dataset Splits | Yes | "The training is then performed over a 5-fold cross-validation." and "We take the 1k English set and use 0.1 of the training set as validation." and "We generate 1k and 50 logic programs per task for training with 0.1 as validation and another 1k for testing."
Hardware Specification | Yes | "Every model is trained via back-propagation using Adam [22] with learning rate 0.001 on an Intel Core i7-6700 CPU"
Software Dependencies | No | "Our implementation using Chainer [45] is publicly available at https://github.com/nuric/softuni with the accompanying data." (Chainer is named, but no version number is given and no other dependencies are listed with versions.)
Experiment Setup | Yes | "Every model is trained via back-propagation using Adam [22] with learning rate 0.001 on an Intel Core i7-6700 CPU using the following objective function: $\overbrace{\lambda_K \mathcal{L}_{nll}(f)}^{\text{Original output}} + \lambda_I \big[\, \overbrace{\mathcal{L}_{nll}(f \circ g)}^{\text{Unification output}} + \overbrace{\tau \sum_{s \in S} \psi(s)}^{\text{Sparsity}} \,\big]$ (1), where $\mathcal{L}_{nll}$ is the negative log-likelihood with sparsity regularisation over $\psi$ at $\tau = 0.1$" and "Full details of the models, including hyper-parameters, are available in Appendix A."
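To make the quoted objective concrete, the following is a minimal sketch of Eq. (1) assuming the Chainer framework named in the paper. The function name `unification_loss` and the arguments `original_logits`, `unified_logits`, `psi`, `lambda_k` and `lambda_i` are illustrative placeholders rather than the authors' identifiers; only τ = 0.1 is taken from the quoted text.

```python
import chainer.functions as F

def unification_loss(original_logits, unified_logits, answers, psi,
                     lambda_k=1.0, lambda_i=1.0, tau=0.1):
    """Sketch of Eq. (1): original output + unification output + sparsity."""
    # L_nll(f): negative log-likelihood of the base network's own prediction.
    loss_original = F.softmax_cross_entropy(original_logits, answers)
    # L_nll(f o g): negative log-likelihood of the prediction after soft unification.
    loss_unified = F.softmax_cross_entropy(unified_logits, answers)
    # tau * sum_{s in S} psi(s): sparsity regularisation over the unification scores.
    sparsity = tau * F.sum(psi)
    # lambda_k and lambda_i are assumed weights; their values are not given in the quote.
    return lambda_k * loss_original + lambda_i * (loss_unified + sparsity)
```

The quoted setup would then minimise this loss with Chainer's Adam optimiser, e.g. `chainer.optimizers.Adam(alpha=0.001)`, where `alpha` is Chainer's name for the learning rate.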
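Relatedly, the (context, query, answer) structure quoted in the Open Datasets row can be pictured as below; the bAbI-style sentences are invented purely for illustration.

```python
# Hypothetical (C, Q, a) example in the bAbI-style story format described above.
example = {
    "context": ["Mary moved to the kitchen.", "John went to the garden."],  # C
    "query": "Where is Mary?",                                              # Q
    "answer": "kitchen",                                                    # a
}
```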
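The splits quoted in the Dataset Splits row amount to 5-fold cross-validation with 0.1 of each training fold held out for validation. A sketch of that procedure, using scikit-learn as an assumed convenience (the paper does not name it), could look like:

```python
from sklearn.model_selection import KFold, train_test_split

def make_splits(examples, seed=0):
    """Yield (train, validation, test) index arrays, one triple per fold."""
    kfold = KFold(n_splits=5, shuffle=True, random_state=seed)
    for train_idx, test_idx in kfold.split(examples):
        # Hold out 0.1 of the training fold as a validation set.
        train_idx, val_idx = train_test_split(train_idx, test_size=0.1,
                                              random_state=seed)
        yield train_idx, val_idx, test_idx
```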