Top-Down Neural Model For Formulae

Authors: Karel Chvalovský

ICLR 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 3 EXPERIMENTS
Researcher Affiliation | Academia | Karel Chvalovský, Czech Institute of Informatics, Robotics, and Cybernetics, Czech Technical University in Prague, karel@chvalovsky.cz
Pseudocode | No | The paper describes its methods in prose and diagrams but does not include structured pseudocode or algorithm blocks.
Open Source Code | No | The paper provides a link to the dataset it uses, but does not include any statement or link for the source code of the described methodology.
Open Datasets | Yes | To provide comparable results in Section 3 we use the dataset presented in Evans et al. (2018) and thoroughly described therein. ... It can be obtained from https://github.com/deepmind/logical-entailment-dataset.
Dataset Splits | Yes | The dataset contains train (99876 triples), validation (5000), and various test sets with a different level of difficulty given by a number of atoms and connectives occurring in them.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments, only mentioning general training parameters.
Software Dependencies | No | The paper mentions software components like GRU, LSTM, and the Adam optimizer, but does not provide specific version numbers for any libraries or frameworks used (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup | Yes | The following implementation of the model introduced in Section 2 is our standard experimental model, called Top Down Net: w ∈ R^d is a learned vector; every c_i is a sequence of linear, ReLU, linear, and ReLU layers, where the input size and output size is always the same, with the exception of binary connectives, where the last linear layer is R^d → R^{2d}; RNN-Var is a gated recurrent unit (GRU) with 2 recurrent layers and the size of the input and hidden state is d; RNN-All is a GRU with 1 recurrent layer and the size of the input and hidden state is d; Final is a sequence of linear (R^d → R^{d/2}), ReLU, linear (R^{d/2} → R^2), and log softmax layers; and we use the mean square error as a loss function and Adam as an optimizer with the learning rate 10^-4. A key parameter of our model is the dimension d...
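
To make the quoted setup concrete, the following is a minimal PyTorch sketch of the listed components. It is an illustrative reconstruction, not the author's released code: the class name TopDownNetComponents, the default dimension d=64, and the way the modules are grouped are assumptions; only the layer shapes, the GRU configurations, the loss, and the optimizer settings follow the quote above.

```python
import torch
import torch.nn as nn

class TopDownNetComponents(nn.Module):
    """Sketch of the components named in the Experiment Setup quote.

    Hypothetical reconstruction: names and wiring are assumptions; only
    the layer shapes and GRU sizes follow the paper's description.
    """

    def __init__(self, d: int = 64):  # d is the key model dimension; 64 is an illustrative default
        super().__init__()
        # w in R^d: a learned vector.
        self.w = nn.Parameter(torch.randn(d))
        # c_i for a unary connective: linear, ReLU, linear, ReLU,
        # with input size equal to output size.
        self.unary = nn.Sequential(
            nn.Linear(d, d), nn.ReLU(), nn.Linear(d, d), nn.ReLU()
        )
        # c_i for a binary connective: the last linear layer maps R^d -> R^{2d}.
        self.binary = nn.Sequential(
            nn.Linear(d, d), nn.ReLU(), nn.Linear(d, 2 * d), nn.ReLU()
        )
        # RNN-Var: GRU with 2 recurrent layers, input and hidden size d.
        self.rnn_var = nn.GRU(d, d, num_layers=2)
        # RNN-All: GRU with 1 recurrent layer, input and hidden size d.
        self.rnn_all = nn.GRU(d, d, num_layers=1)
        # Final: linear (R^d -> R^{d/2}), ReLU, linear (R^{d/2} -> R^2), log softmax.
        self.final = nn.Sequential(
            nn.Linear(d, d // 2), nn.ReLU(), nn.Linear(d // 2, 2),
            nn.LogSoftmax(dim=-1)
        )

model = TopDownNetComponents(d=64)
# Loss and optimizer as quoted: mean square error and Adam with learning rate 10^-4.
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
```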