Semantic Re-tuning with Contrastive Tension
Authors: Fredrik Carlsson, Amaru Cuba Gyllensten, Evangelia Gogoulou, Erik Ylipää Hellqvist, Magnus Sahlgren
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Results from multiple common unsupervised and supervised STS tasks indicate that CT outperforms previous State Of The Art (SOTA), and when combining CT with supervised data we improve upon previous SOTA results with large margins. |
| Researcher Affiliation | Academia | Fredrik Carlsson, Amaru Cuba Gyllensten, Evangelia Gogoulou, Erik Ylipää Hellqvist, Magnus Sahlgren; RISE NLU Group; {firstname.lastname}@ri.se |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code and models are available at Github.com/FreddeFrallan/Contrastive-Tension |
| Open Datasets | Yes | Training data is randomly sampled from English Wikipedia (See Appendix C.2), where we collect K = 7 negative sentence pairs for each positive sentence pair. (Table 12: English https://dumps.wikimedia.org/enwiki/20200820/enwiki-20200820-pages-articles-multistream.xml.bz2) |
| Dataset Splits | Yes | These sentence embeddings are directly evaluated towards the STS-b test (Cer et al., 2017), without any additional training, from which we report the Spearman correlation between the cosine similarity of the embeddings and the manually collected similarity scores. The test partition of the dataset contains 1,379 sentence pairs... (Also, "Table 2 shows the test results of the model that performed best on the validation set." See the evaluation sketch after the table.) |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions using the "Huggingface API" and "SentEval package" but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | Unless stated otherwise, the following set of hyperparameters is applied when using CT throughout all experiments: Training data is randomly sampled from English Wikipedia (See Appendix C.2), where we collect K = 7 negative sentence pairs for each positive sentence pair. The batch size is set to 16, which results in every batch having 2 positive sentence pairs and 14 negative sentence pairs. We apply an RMSProp optimizer (Hinton, 2012) with a fixed learning rate schedule that decreases from 1e-5 to 2e-6 (Appendix A.3). (Minimal sketches of this batch construction and schedule follow the table.) |
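
To make the batch construction in the setup row concrete, here is a minimal Python sketch of how positive and negative sentence pairs could be assembled, assuming sentences have already been sampled from English Wikipedia. The function names and the uniform random negative sampling are illustrative assumptions, not the authors' code; with K = 7 and contiguous batches of 16, each batch contains exactly 2 positive and 14 negative pairs.

```python
import random

K = 7            # negative sentence pairs per positive pair (setup row)
BATCH_SIZE = 16  # yields 2 positive + 14 negative pairs per batch

def build_pairs(sentences, k=K):
    """Pair each sentence with itself (positive, label 1) and with k randomly
    drawn different sentences (negative, label 0)."""
    pairs = []
    for sent in sentences:
        pairs.append((sent, sent, 1))
        for _ in range(k):
            other = random.choice(sentences)
            while other == sent:
                other = random.choice(sentences)
            pairs.append((sent, other, 0))
    return pairs

def batches(pairs, batch_size=BATCH_SIZE):
    """Contiguous batches of 16 pairs; with the 1:7 positive/negative ratio
    above, every batch holds exactly 2 positive and 14 negative pairs."""
    for i in range(0, len(pairs) - batch_size + 1, batch_size):
        yield pairs[i:i + batch_size]
```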
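The optimizer row can be read as the following PyTorch sketch. The linear decay shape, the placeholder model, and the step count are assumptions; the section only states that a fixed RMSProp schedule decreases the learning rate from 1e-5 to 2e-6, with details deferred to the paper's Appendix A.3.

```python
import torch

# Placeholder encoder and step count; the paper fine-tunes pre-trained
# Transformer models via the Huggingface API, which this sketch does not model.
model = torch.nn.Linear(768, 768)
total_steps = 10_000  # assumed value, not stated in this section

optimizer = torch.optim.RMSprop(model.parameters(), lr=1e-5)

# Linear decay is an assumption; only the 1e-5 -> 2e-6 endpoints are given.
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer,
    lambda step: 1.0 - 0.8 * min(step / total_steps, 1.0),  # factor 1.0 -> 0.2
)

for step in range(total_steps):
    # ... forward pass, CT loss, optimizer.step() would go here ...
    scheduler.step()
```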
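For the STS-b evaluation described in the dataset-splits row, the reported number is the Spearman correlation between the cosine similarity of each sentence pair's embeddings and the human similarity scores. A minimal sketch, assuming the 1,379 test-pair embeddings are already computed (scipy is an assumed dependency here; the paper itself evaluates via the SentEval package):

```python
import numpy as np
from scipy.stats import spearmanr

def sts_spearman(emb_a, emb_b, gold_scores):
    """Spearman correlation between cosine similarities of paired sentence
    embeddings and the manually collected STS-b similarity scores."""
    emb_a = np.asarray(emb_a, dtype=np.float64)
    emb_b = np.asarray(emb_b, dtype=np.float64)
    cosine = np.sum(emb_a * emb_b, axis=1) / (
        np.linalg.norm(emb_a, axis=1) * np.linalg.norm(emb_b, axis=1)
    )
    return spearmanr(cosine, gold_scores).correlation
```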