Beam Tree Recursive Cells

Authors: Jishnu Ray Chowdhury, Cornelia Caragea

ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our proposed models on different out-of-distribution splits of both synthetic and realistic data. Our experiments show that BT-Cell achieves near-perfect performance on several challenging structure-sensitive synthetic tasks such as ListOps and logical inference while maintaining comparable performance on realistic data against other RvNN-based models.
Researcher Affiliation | Academia | Computer Science, University of Illinois Chicago. Correspondence to: Jishnu Ray Chowdhury <jraych2@uic.edu>, Cornelia Caragea <cornelia@uic.edu>.
Pseudocode | Yes | We present the pseudocode of the easy-first composition in Algorithm 1 and the pseudocode of BT-Cell in Algorithm 2. (A sketch of the easy-first idea appears after this table.)
Open Source Code | Yes | The code is available at: https://github.com/JRC1995/BeamTreeRecursiveCells
Open Datasets | Yes | ListOps (Nangia & Bowman, 2018; Williams et al., 2018a) is a challenging synthetic task that requires solving nested mathematical operations over lists of arguments. We present our results on ListOps in Table 1. ... Dataset Settings: SST5 (Socher et al., 2013) and IMDB (Maas et al., 2011) are natural language classification datasets (for sentiment classification). ... Dataset Settings: We ran our models on MNLI (Williams et al., 2018b), which is a natural language inference task.
Dataset Splits | Yes | We create 100,000 training examples with arguments 5, lengths 100, and depths 6; 2,000 development examples with arguments 5, lengths 100, and depths 7; and 2,000 test examples with arguments 5, lengths 100, and depths 8-10.
Hardware Specification | Yes | All experiments are run on a single RTX A6000 GPU.
Software Dependencies | No | The paper mentions software components such as GELU and PyTorch implicitly, through citations or general discussion (e.g., "implemented with the Gated Recursive Cell (GRC)" and "a GELU (Hendrycks & Gimpel, 2016) activation function was used"), but it does not specify concrete version numbers for any software libraries or dependencies.
Experiment Setup | Yes | For experiments with BT-Cell models, we set the beam size to 5 as a practical choice (neither too big nor too small). ... For all recursive/recurrent models, we use a linear layer followed by layer normalization for the initial leaf transformation... Overall, we use the same boilerplate classifier architecture for classification and the same boilerplate sentence-pair Siamese architecture for logical inference... In practice, for BT-Cell, we use a stochastic top-k through Gumbel perturbation (see the sketch after this table)... In terms of the optimizer, hidden size, and other hyperparameters besides dropout, we use the same ones as Chowdhury & Caragea (2021) for all models... Generally, we use a patience of 5 for the original ListOps training for all models, but a patience of 10 for CRvNN... For BSRP-Cell we use a beam size of 8... We use a dropout rate of 0.1 for logical inference... After tuning, for GRC-based models on SST5, we found a dropout rate of 0.4 for the input/output dropout layers and 0.2 for the dropout layer in the cell function; we found a dropout of 0.3 for LSTM-based models on SST5. We share the hyperparameters of SST5 with IMDB. For MNLI, we used similar settings to Chowdhury & Caragea (2021). Batch size is set to 1.
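
To make the quoted Pseudocode row concrete: easy-first composition scores every adjacent pair of nodes, merges the highest-scoring pair, and repeats until a single root vector remains. Below is a minimal sketch of that greedy loop, assuming PyTorch; the class name EasyFirstComposer and the linear scorer are illustrative assumptions, not the authors' implementation, and the `cell` argument stands in for the Gated Recursive Cell (GRC) used in the paper.

```python
import torch
import torch.nn as nn

class EasyFirstComposer(nn.Module):
    """Greedy easy-first composition: repeatedly merge the adjacent pair
    of nodes with the highest score until one root vector remains.
    A sketch of the idea behind Algorithm 1, not the authors' code."""

    def __init__(self, hidden_size, cell):
        super().__init__()
        self.cell = cell                      # composes (left, right) -> parent
        self.scorer = nn.Linear(hidden_size, 1)

    def forward(self, leaves):                # leaves: list of (hidden_size,) tensors
        nodes = list(leaves)
        while len(nodes) > 1:
            # Compose every adjacent pair and score each candidate parent.
            parents = [self.cell(nodes[i], nodes[i + 1])
                       for i in range(len(nodes) - 1)]
            scores = torch.stack([self.scorer(p) for p in parents]).squeeze(-1)
            i = int(torch.argmax(scores))     # take the "easiest" merge first
            nodes[i:i + 2] = [parents[i]]     # replace the pair with its parent
        return nodes[0]

# Toy usage with element-wise averaging as a stand-in composition function.
composer = EasyFirstComposer(8, cell=lambda l, r: (l + r) / 2)
root = composer([torch.randn(8) for _ in range(5)])
```

Roughly, BT-Cell replaces the single argmax with a beam: it tracks the k highest-scoring merge sequences (k = 5 in the setup above) and aggregates the resulting root representations.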
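
The Experiment Setup row also mentions a stochastic top-k through Gumbel perturbation. The sketch below shows the standard Gumbel-perturbed top-k trick, again assuming PyTorch; the function name gumbel_topk and the tau temperature are assumptions for illustration rather than the paper's exact formulation.

```python
import torch

def gumbel_topk(scores: torch.Tensor, k: int, tau: float = 1.0) -> torch.Tensor:
    """Indices of the top-k entries after perturbing scores with Gumbel noise.

    Adding Gumbel(0, 1) noise before a hard top-k makes beam selection
    stochastic during training; at test time, torch.topk on the raw
    scores recovers the deterministic choice.
    """
    uniform = torch.rand_like(scores)
    # Gumbel(0, 1) sample: -log(-log(U)), with clamps for numerical safety.
    gumbel = -torch.log((-torch.log(uniform.clamp_min(1e-9))).clamp_min(1e-9))
    _, indices = torch.topk(scores + tau * gumbel, k, dim=-1)
    return indices

# E.g., selecting a beam of 5 candidate merges, matching the beam size above.
beam = gumbel_topk(torch.randn(32), k=5)
```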