Logic and the 2-Simplicial Transformer

Authors: James Clift, Dmitry Doryn, Daniel Murfet, James Wallbridge

ICLR 2020

Reproducibility Variable Result LLM Response
Research Type Experimental Our experiments show that the simplicial agent confers an advantage over the relational agent as an inductive bias in our reasoning task.
Researcher Affiliation Academia James Clift, Dmitry Doryn, Daniel Murfet, James Wallbridge ({jamesedwardclift,dmitry.doryn,james.wallbridge}@gmail.com); Department of Mathematics, University of Melbourne (d.murfet@unimelb.edu.au)
Pseudocode Yes The pseudo-code for the ordinary and 2-simplicial Transformer blocks is:

    def transformer_block(e):
        x = LayerNorm(e)
        a = 1SimplicialAttention(x)
        b = DenseLayer1(a)
        c = DenseLayer2(b)
        r = Add([e, c])
        eprime = LayerNorm(r)
        return eprime

    def simplicial_transformer_block(e):
        x = LayerNorm(e)
        a1 = 1SimplicialAttention(x)
        a2 = 2SimplicialAttention(x)
        a2n = LayerNorm(a2)
        ac = Concatenate([a1, a2n])
        b = DenseLayer1(ac)
        c = DenseLayer2(b)
        r = Add([e, c])
        eprime = LayerNorm(r)
        return eprime
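A minimal NumPy sketch of the ordinary Transformer block above, for reference. This is not the authors' implementation: the 1-simplicial attention is rendered here as plain single-head scaled dot-product attention, and the ReLU nonlinearity, parameter layout, and all dimensions are assumptions.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # Normalize each row to zero mean and unit variance over the feature axis.
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def softmax(x):
    z = np.exp(x - x.max(axis=-1, keepdims=True))
    return z / z.sum(axis=-1, keepdims=True)

def one_simplicial_attention(x, wq, wk, wv):
    # Plain scaled dot-product attention standing in for 1SimplicialAttention.
    q, k, v = x @ wq, x @ wk, x @ wv
    weights = softmax(q @ k.T / np.sqrt(k.shape[-1]))
    return weights @ v

def transformer_block(e, params):
    # Mirrors the pseudo-code: norm, attention, two dense layers, residual, norm.
    x = layer_norm(e)
    a = one_simplicial_attention(x, *params["attn"])
    b = np.maximum(0.0, a @ params["w1"])  # DenseLayer1 (ReLU assumed)
    c = b @ params["w2"]                   # DenseLayer2
    r = e + c                              # Add([e, c])
    return layer_norm(r)                   # eprime
```

The simplicial block differs only in computing a second (2-simplicial) attention tensor, layer-normalizing it, and concatenating it with the 1-simplicial output before the dense layers.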
Open Source Code Yes The code for our implementation of both agents is available online (Clift et al., 2019).
Open Datasets No The environment in our reinforcement learning problem is a variant of the Box World environment from (Zambaldi et al., 2019). The paper describes how the environment is generated ('colours are uniformly sampled from a set of 20 colours and the boxes and loose keys are arranged randomly'), but does not provide a concrete access link (URL, DOI, specific repository) for a pre-existing dataset of this environment or its generated instances.
Dataset Splits No The paper describes how the environment episodes are generated for training and evaluation (e.g., 'half of the episodes contain a bridge, the solution length is uniformly sampled from [1, 3]'), but does not explicitly provide specific training/validation/test dataset splits with percentages, sample counts, or references to predefined splits.
Hardware Specification Yes Experiments were conducted either on the Google Cloud Platform with a single head node with 12 virtual CPUs and one NVIDIA Tesla P100 GPU and 192 additional virtual CPUs spread over two pre-emptible worker nodes, or on the University of Melbourne Nectar research cloud with a single head node with 12 virtual CPUs and two NVIDIA Tesla K80 GPUs, and 222 worker virtual CPUs.
Software Dependencies No The paper mentions software like Ray RLlib, Keras, and TensorFlow, but does not provide specific version numbers for these software dependencies (e.g., 'Ray RLlib (Liang et al., 2018)', 'Keras from (Mavreshko, 2019)').
Experiment Setup Yes The hyperparameters for IMPALA and RMSProp are given in Table 1 of Appendix E. Table 1: Hyperparameters for agent training. IMPALA entropy 5 × 10⁻³; Discount factor γ 0.99; Unroll length 40 timesteps; Batch size 1280 timesteps; Learning rate 2 × 10⁻⁴; RMSProp momentum 0; RMSProp ε 0.1; RMSProp decay 0.99.
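The Table 1 hyperparameters, collected into a flat config dict for convenience. The values are taken directly from the table; the key names are illustrative choices, not identifiers from the authors' code.

```python
# Hyperparameters for agent training (Table 1, Appendix E).
# Key names are assumptions; values are as reported in the paper.
IMPALA_HPARAMS = {
    "entropy_coeff": 5e-3,       # IMPALA entropy
    "gamma": 0.99,               # discount factor
    "unroll_length": 40,         # timesteps
    "batch_size": 1280,          # timesteps
    "learning_rate": 2e-4,
    "rmsprop_momentum": 0.0,
    "rmsprop_epsilon": 0.1,
    "rmsprop_decay": 0.99,
}
```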