Attention-based Iterative Decomposition for Tensor Product Representation
Authors: Taewon Park, Inchul Choi, Minho Lee
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In experiments, we apply AID to several recent TPR-based or TPR equivalent approaches to show its effectiveness and flexibility. In all experimental results, AID shows effective generalization performance improvement for all TPR models. |
| Researcher Affiliation | Collaboration | Taewon Park (1), Inchul Choi (2), Minho Lee (1,2,*); (1) Kyungpook National University, South Korea; (2) ALI Co., Ltd., South Korea |
| Pseudocode | Yes | Algorithm 1 Attention-based Iterative Decomposition module. |
| Open Source Code | Yes | The code of AID is publicly available at https://github.com/taewonpark/AID |
| Open Datasets | Yes | SAR task: We use a set of arbitrary 1,000 words to construct each word set, as outlined in Table 15. bAbI task: The bAbI task (Weston et al., 2015). Sort-of-CLEVR task: The Sort-of-CLEVR task (Santoro et al., 2017). WikiText-103 task: WikiText-103 task (Merity et al., 2016). |
| Dataset Splits | Yes | WikiText-103 task (Merity et al., 2016)... The training set consists of 28,475 articles, while the validation and test sets contain 60 articles each. |
| Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running experiments. |
| Software Dependencies | No | The paper mentions optimizers (Adam, RMSprop) and model architectures (LSTM), but does not provide specific software dependency details with version numbers (e.g., Python, PyTorch, TensorFlow, CUDA versions) needed for replication. |
| Experiment Setup | Yes | SAR task: we utilize the Adam optimizer with a batch size of 64, a learning rate of 1e-3, β1 of 0.9, and β2 of 0.98 for 30K training iterations. ... sys-bAbI task: embedding size of 179. ... Adam optimizer with a batch size of 64, a learning rate of 1e-3, β1 of 0.9, and β2 of 0.99 for 100 training epochs. ... WikiText-103 task: Adam optimizer with a batch size of 96, an initial learning rate of 2.5e-4, and a learning rate warmup of 2,000 steps for 120 epochs. ... Tables 6, 7, and 8 show our module's hyper-parameter settings for each task. (A hedged optimizer sketch mirroring these settings follows the table.) |
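As a convenience for replication, the sketch below mirrors the SAR-task optimizer settings quoted above. It is a minimal sketch, assuming PyTorch; the paper does not name the framework or the exact model class, so `model` is a hypothetical stand-in and only the optimizer hyperparameters (Adam, lr 1e-3, β1 0.9, β2 0.98, batch size 64, 30K iterations) come from the paper.

```python
import torch
import torch.nn as nn

# Hypothetical placeholder model; the actual AID-augmented TPR model is defined
# in the authors' repository (https://github.com/taewonpark/AID).
model = nn.LSTM(input_size=64, hidden_size=128)

# SAR task settings as quoted from the paper:
# Adam optimizer, learning rate 1e-3, beta1 = 0.9, beta2 = 0.98.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, betas=(0.9, 0.98))

# Batch size and iteration count from the same setup description.
BATCH_SIZE = 64
TRAIN_ITERATIONS = 30_000
```

For the sys-bAbI setting, the quoted values differ only in β2 (0.99) and in training for 100 epochs rather than a fixed iteration count; the WikiText-103 setting additionally uses a 2,000-step learning rate warmup, which the paper does not specify in further detail.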