Learning to Solve Constraint Satisfaction Problems with Recurrent Transformer

Authors: Zhun Yang, Adam Ishay, Joohyung Lee

ICLR 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Section headings cited as evidence: "4 EXPERIMENTS WITH RECURRENT TRANSFORMER"; "Table 1: Whole board accuracy on different Sudoku datasets."; "4.1.1 ABLATION STUDY ON MODEL DESIGN (LXRYHZ) WITH TEXTUAL SUDOKU"; "5.2 EXPERIMENTS ON INJECTING LOGICAL CONSTRAINTS IN RECURRENT TRANSFORMER TRAINING"
Researcher Affiliation | Collaboration | Zhun Yang (1), Adam Ishay (1) & Joohyung Lee (1,2); (1) School of Computing and AI, Arizona State University, AZ, USA; (2) Global AI Center, Samsung Research, S. Korea
Pseudocode | No | The paper provides detailed mathematical formulations for the Recurrent Transformer architecture and describes its components, but it does not include a distinct block labeled 'Pseudocode' or 'Algorithm'.
Open Source Code | Yes | "The code is available at https://github.com/azreasoners/recurrent_transformer."
Open Datasets | Yes | "For textual Sudoku, we use the SATNet dataset from (Wang et al., 2019) and the RRN dataset from (Palm et al., 2018). For visual Sudoku, we use the ungrounded SATNet-V dataset from (Topan et al., 2021). In addition to SATNet-V, we created a new ungrounded dataset, RRN-V, following the same procedure based on the RRN dataset."; "MNIST. We use MNIST images (LeCun et al., 1998) (http://yann.lecun.com/exdb/mnist/)"
Dataset Splits | Yes | "We use the shortest path dataset SP4 from (Xu et al., 2018)...we split the dataset into 60%/20%/20% training/test/validation examples." (A dataset-splitting sketch follows this table.)
Hardware Specification | Yes | "All of our experiments were done on Ubuntu 18.04.2 LTS with two 10-core Intel(R) Xeon(R) CPU E5-2640 v4 @ 2.40GHz and four GP104 [GeForce GTX 1080]."
Software Dependencies | No | The paper mentions the operating system ('Ubuntu 18.04.2 LTS') and that the implementation is 'based on Andrej Karpathy's minGPT repository'. However, it does not specify version numbers for other key software dependencies such as the programming language (e.g., Python), deep learning frameworks (e.g., PyTorch, TensorFlow), or specific libraries.
Experiment Setup | Yes | "F.2 TRAINING DETAILS"; "The values of the weights α and β of the constraint losses L_sudoku and L_attention are selected from {0, 0.1, 0.5, 1} to achieve the highest training accuracy."; "Table 8: Model Structure and Hyperparameters for Textual Sudoku Experiments" (batch size, learning rate, dropout, number of attention heads, number of layers, number of recurrences, embedding dimension, token embedder, sequence length). (A weight-selection sketch follows this table.)
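
The Dataset Splits row quotes a 60%/20%/20% training/test/validation split of the SP4 shortest-path dataset, but the excerpt does not include the splitting code. A minimal sketch of such a split, assuming a NumPy-based pipeline (the function name, seed, and example count are illustrative, not from the paper):

```python
import numpy as np

def split_indices(n_examples, seed=0):
    """Shuffle example indices and split them 60%/20%/20% into
    training/test/validation, matching the quoted split ratios.
    The seed and helper name are assumptions for illustration."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_examples)
    n_train = int(0.6 * n_examples)
    n_test = int(0.2 * n_examples)
    train = idx[:n_train]
    test = idx[n_train:n_train + n_test]
    val = idx[n_train + n_test:]
    return train, test, val

# Example usage with a hypothetical dataset size.
train_idx, test_idx, val_idx = split_indices(10000)
```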
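
Likewise, the Experiment Setup row states that the constraint-loss weights α and β are chosen from {0, 0.1, 0.5, 1} by highest training accuracy. The following is only a hedged sketch of that selection loop; `train_and_evaluate` is a hypothetical callback standing in for the authors' actual training code in the linked repository:

```python
from itertools import product

# Candidate weights for the constraint losses L_sudoku and L_attention,
# as quoted in the Experiment Setup row.
WEIGHTS = [0, 0.1, 0.5, 1]

def select_constraint_weights(train_and_evaluate):
    """Grid-search (alpha, beta) over {0, 0.1, 0.5, 1}^2 and keep the
    pair that yields the highest training accuracy.
    `train_and_evaluate(alpha, beta) -> float` is a hypothetical
    callback that trains the model and returns training accuracy."""
    best = None
    for alpha, beta in product(WEIGHTS, WEIGHTS):
        acc = train_and_evaluate(alpha, beta)
        if best is None or acc > best[0]:
            best = (acc, alpha, beta)
    return best  # (accuracy, alpha, beta)

# Example with a stand-in evaluation function (purely illustrative).
dummy_eval = lambda a, b: 1.0 - abs(a - 0.5) - abs(b - 0.1)
print(select_constraint_weights(dummy_eval))  # -> (1.0, 0.5, 0.1)
```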