Modeling Hierarchical Structures with Continuous Recursive Neural Networks

Authors: Jishnu Ray Chowdhury, Cornelia Caragea

ICML 2021

Reproducibility assessment. Each entry below gives a reproducibility variable, the assessed result, and the LLM response supporting it.
Research Type: Experimental
LLM response: "In this section, we discuss our experiments and results. We evaluate our model on logical inference (Bowman et al., 2015b), list operations (ListOps) (Nangia & Bowman, 2018), sentiment analysis (two datasets: SST2 and SST5) (Socher et al., 2013), and natural language inference (two datasets: SNLI (Bowman et al., 2015a) and MNLI (Williams et al., 2018b))."
Researcher Affiliation: Academia
LLM response: "Computer Science, University of Illinois at Chicago, United States. Correspondence to: Jishnu Ray Chowdhury <jraych2@uic.edu>, Cornelia Caragea <cornelia@uic.edu>."
Pseudocode: Yes
LLM response: "Algorithm 1: Continuous Recursive Neural Network"
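The paper's Algorithm 1 is not reproduced here, but the general idea it pseudocodes, replacing discrete tree composition with soft, differentiable gating of neighbor compositions, can be sketched. The following is a minimal illustration of that flavor of continuous recursion, not the authors' algorithm: the class name `ContinuousComposer`, the sigmoid gate, the fixed step count, and the mean-pooled output are all illustrative assumptions; the real model's scoring, halting, and composition details live in the linked repository.

```python
import torch
import torch.nn as nn

class ContinuousComposer(nn.Module):
    """Illustrative soft-gated recursive composition (NOT the paper's Algorithm 1)."""
    def __init__(self, dim: int, num_steps: int = 8):
        super().__init__()
        self.score = nn.Linear(2 * dim, 1)                       # gate: should this pair compose?
        self.compose = nn.Sequential(nn.Linear(2 * dim, dim), nn.GELU())
        self.num_steps = num_steps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, dim) token embeddings
        for _ in range(self.num_steps):
            left, right = x[:, :-1], x[:, 1:]                    # adjacent (left, right) pairs
            pair = torch.cat([left, right], dim=-1)
            gate = torch.sigmoid(self.score(pair))               # soft merge decision in [0, 1]
            merged = self.compose(pair)                          # candidate parent node
            # Continuous update: each right node interpolates toward its parent,
            # so composition stays differentiable instead of a hard tree choice.
            x = torch.cat([x[:, :1], gate * merged + (1 - gate) * right], dim=1)
        return x.mean(dim=1)                                     # pooled sequence representation

model = ContinuousComposer(dim=64)
print(model(torch.randn(2, 10, 64)).shape)                      # torch.Size([2, 64])
```

For the exact procedure, consult Algorithm 1 in the paper and the released code.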
Open Source Code: Yes
LLM response: "Our code is available at: https://github.com/JRC1995/Continuous-RvNN"
Open Datasets: Yes
LLM response: "We evaluate our model on logical inference (Bowman et al., 2015b), list operations (ListOps) (Nangia & Bowman, 2018), sentiment analysis (two datasets: SST2 and SST5) (Socher et al., 2013), and natural language inference (two datasets: SNLI (Bowman et al., 2015a) and MNLI (Williams et al., 2018b))."
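All benchmark datasets named in the quote are public. A minimal sketch of one common access path, assuming the Hugging Face `datasets` library (the paper itself does not prescribe this, and Hub IDs such as "snli" and "sst2" may differ from the exact files the authors used):

```python
from datasets import load_dataset

snli = load_dataset("snli")   # natural language inference: premise / hypothesis / label
sst2 = load_dataset("sst2")   # binary sentiment analysis: sentence / label
print(snli["train"][0])
```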
Dataset Splits: Yes
LLM response: "To evaluate CRvNN for length generalization, as in prior work, we train the model only on samples with ≤ 6 operations whereas we test it on samples with a higher, unseen number of operations (≥ 7)."
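The quoted split can be made concrete. Below is a minimal sketch assuming ListOps expressions are stored as bracketed strings; `count_operations`, `length_generalization_split`, and the `{"expr": ..., "label": ...}` sample format are hypothetical stand-ins for the repository's actual preprocessing.

```python
def count_operations(expr: str) -> int:
    """Count operator tokens in a ListOps expression; each operator opens a bracket."""
    return expr.count("[")

def length_generalization_split(samples):
    """Train on short expressions (<= 6 operations), test on longer unseen ones (>= 7)."""
    train = [s for s in samples if count_operations(s["expr"]) <= 6]
    test = [s for s in samples if count_operations(s["expr"]) >= 7]
    return train, test

samples = [{"expr": "[MAX 2 [MIN 3 4 ] 5 ]", "label": 5}]
train, test = length_generalization_split(samples)
```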
Hardware Specification: Yes
LLM response: "We trained both the models on 50 samples for 1 epoch and 1 batch size on an AWS P3.2 instance (Nvidia V100)."
Software Dependencies: No
LLM response: The paper mentions using GeLU as an activation function and implies the use of an optimizer such as Adam (cited), but does not provide version numbers for software libraries or dependencies. It states "For implementation details, refer to the appendix", but the appendix is not included in the provided text.
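Although no versions are pinned, the two components the response names map onto standard library primitives. A minimal sketch, assuming a PyTorch implementation (the linked repository is PyTorch-based, but this snippet is illustrative, not the authors' code, and the learning rate is an arbitrary placeholder):

```python
import torch.nn as nn
import torch.optim as optim

layer = nn.Sequential(nn.Linear(128, 128), nn.GELU())          # GeLU activation, as mentioned
optimizer = optim.Adam(layer.parameters(), lr=1e-3)            # Adam optimizer; lr is illustrative
```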
Experiment Setup: No
LLM response: The paper states "For implementation details, refer to the appendix." (Section 4). No specific hyperparameters (e.g., learning rate, batch size, number of epochs, optimizer settings) or detailed training configurations are stated in the main text provided.