Nugget: Neural Agglomerative Embeddings of Text

Authors: Guanghui Qin, Benjamin Van Durme

ICML 2023

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We demonstrate NUGGET outperforms related approaches in tasks involving semantic comparison. Finally, we illustrate that these compact units allow for expanding the contextual window of a language model (LM), suggesting new future LMs that can condition on larger amounts of content. |
| Researcher Affiliation | Academia | Department of Computer Science, Johns Hopkins University, USA. |
| Pseudocode | No | The paper describes the model's architecture and mathematical equations (e.g., Equations 1-8 and Figure 2), but it does not contain a dedicated section labeled "Pseudocode" or "Algorithm", nor does it present structured steps in a code-like format. |
| Open Source Code | No | The paper states, "Those 2 datasets are released in https://github.com/hiaoxui/nugget-data" (Section 6.1.1, footnote 2), which refers to datasets, and mentions reliance on other open-source software, but it does not provide a direct link or explicit statement for the release of the source code for the NUGGET methodology itself. |
| Open Datasets | Yes | We build 2 document similarity test datasets based on the corpus of PARABANK (Hu et al., 2019) and WikiText-103 (Merity et al., 2016). Those 2 datasets are released in https://github.com/hiaoxui/nugget-data |
| Dataset Splits | No | The paper mentions training on a "training set" and evaluating on a "dev set" and "test set" for WMT19 and WikiText-103, but it does not specify the exact percentages or sample counts for these splits, nor does it cite predefined splits in enough detail to reproduce the data partitioning. |
| Hardware Specification | Yes | Every model is trained on 4 NVIDIA RTX 6000 GPUs with 24GB GPU memory. |
| Software Dependencies | No | The paper lists key software such as PyTorch, Lightning AI, and Huggingface Transformers in the Acknowledgements section, but it does not specify their version numbers. |
| Experiment Setup | Yes | We explored different compression ratios r from 0.05 to 0.25. We freeze the bottom 3 layers (l = 3) in Section 3.3 across our main experiments, and we provide a study of the effect of the number of frozen layers in Section 7.1. We put more training details in Appendix B.1. |
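The quoted experiment setup amounts to a small set of reported hyperparameters (compression ratio r in 0.05 to 0.25, bottom l = 3 layers frozen, 4 RTX 6000 GPUs). The sketch below simply restates those values as a configuration object for reference; it is a hypothetical illustration, not the authors' released code, and the names `NuggetTrainingConfig` and its fields are assumptions.

```python
# Hypothetical configuration sketch restating the hyperparameters reported in the paper.
# It is not the authors' code; class and field names are assumptions for illustration.
from dataclasses import dataclass
from typing import List


@dataclass
class NuggetTrainingConfig:
    # Compression ratio r: fraction of tokens retained as "nuggets" (paper explores 0.05 to 0.25).
    compression_ratio: float = 0.10
    # Number of frozen bottom transformer layers (paper uses l = 3 in the main experiments).
    frozen_layers: int = 3
    # Hardware reported in the paper: 4 NVIDIA RTX 6000 GPUs with 24GB GPU memory.
    num_gpus: int = 4


# Compression ratios within the range explored in the paper (exact grid is an assumption).
EXPLORED_RATIOS: List[float] = [0.05, 0.10, 0.15, 0.20, 0.25]

if __name__ == "__main__":
    for r in EXPLORED_RATIOS:
        cfg = NuggetTrainingConfig(compression_ratio=r)
        print(f"ratio={cfg.compression_ratio:.2f}, frozen_layers={cfg.frozen_layers}, gpus={cfg.num_gpus}")
```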